add NEOCC python interface to astroquery library #2254

D-arioSpace · 2021-12-22T12:05:26Z

This is a python interface to data that the Near Earth Objects Coordination Centre (NEOCC) makes available through its API service (https://neo.ssa.esa.int/computer-access).

pep8speaks · 2021-12-22T12:05:35Z

Hello @D-arioSpace! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-02-07 15:54:47 UTC

ceb8

Hello! This looks like a great new addition to astroquery!

Before I do a more thorough review, there are a couple of important changes that need to be implemented. You are using two packages: pandas and parse, that are not part of the astroquery dependency list (https://astroquery.readthedocs.io/en/latest/#requirements). We ask that new modules not depend on packages not already required, except in limited circumstances. Looking through your code I don't think you are doing anything that would be impossible without pandas or parse. I would be happy to help you rework your code to remove those dependencies, particularly figuring how to replace the pandas functionality with astropy.tables.

D-arioSpace · 2022-03-11T07:21:59Z

Dear @ceb8, it's a pleasure to meet you.
Sorry for the delay in replying. Please take a look at this latest pull where we removed the dependency form the parse library.
Unfortunately we cannot remove the pandas dependency because we do not have the necessary resources to implement the changes, it requires too much effort. Hopefully this may be one of these "limited circumstances" you speak of. In this case it's not a question of functionality (apparently there is nothing that astropy.tables can't do and panda does, but I can't be 100% sure either). It is just a matter of our internal resources that we do not have at the moment.

I perfectly understand that this can introduce dependencies and extra-feature that you would like to avoid. Please take also into account that, apart from the lack of resources, there are also other considerations that push us to this choice:

there are users of some institutions who already use the library. We would like to encourage these users to use the library through astroquery (instead of the stand-alone version we provided) but with as little effort as possible for them. By using the library as it is now, they should just change their current import to from astroquery.esa.neocc import neocc in their code, and everything else will be transparent to them.
all users who now use the library already have pandas pre-installed. For those who don't have it or start with a very basic version of python, the installation can easily be done separately.

Taking all this into consideration, please let us know if we can focus on the integration into astroquery without considering the pandas replacement.

ceb8 · 2022-03-15T18:13:06Z

Hi @D-arioSpace,

Unfortunately we really cannot allow a pandas dependency into our codebase, returning Astropy Tables to users is a very basic requirement in our API specs. However, we can provide the labor to make that change. If this course of action is acceptable, I will implement those changes within this PR.

There is one other issue that needs to be resolved before we proceed, which is the matter of the copyright notice you have on your code:

© Copyright [European Space Agency][2022]
All rights reserved

Astroquery is released under a 3-clause BSD style license (see https://astroquery.readthedocs.io/en/latest/license.html), and individual modules cannot be released under different licenses. Would it be OK to remove this notice? We have other modules contributed by ESA developers, so I believe the 3-clause BSD license should be acceptable from that standpoint.

Let me know what you think. If I get your OK I will, implement the necessary changes and then ping you so you can review my work.

Thanks for your patience.

D-arioSpace · 2022-09-08T11:36:48Z

Hello @ceb8!

First of all, sorry for the delay in replying, unfortunately we had to resolve some legal issues due to the change of copyright requested.

From now on, you can change the license and insert a 3-clause BSD style license anywhere in the code.

Also, I would like to point out that we carried out (within this fork) another push containing the implementation of the new API provided by NEOCC web portal (https://neo.ssa.esa.int/PSDB-portlet/download?file=past_impactors_list), this time using directly astropy.table (see parse_impacted function).

Thanks again for your help. We are looking forward to see this library integrated into astroquery!

ceb8 · 2022-09-08T14:01:04Z

Hi @D-arioSpace!

Thanks for your response, and no worries about the delay. That's very good news about the copyright issue.

I am a bit confused about the new implementation you mention however, since I still see pandas imported all over. Do you mean this to be an example implementation for me? If so, thank you very much and I will get started on the work shortly. Just want to make sure not to step on your toes.

D-arioSpace · 2022-09-09T06:48:15Z

Hello @ceb8

you're right, it is just an example, which we hope to have implemented well (we did this to become familiar with the astropy.table data structure, taking advantage of the fact that this specific function had to be implemented almost from scratch). Feel free to improve it.
And yes, pandas is still in all other functions, which need to be converted.
Thanks again for your help. :)

Dario

ceb8 · 2022-09-09T10:11:22Z

@D-arioSpace Excellent, I am in this middle of one coding project, so I will start this one as soon as that one wraps up.

ceb8 · 2022-11-08T17:14:50Z

@D-arioSpace I'm pushing the de-pandafied lists.py functions so that you can go ahead and take a look at them while I am working on the tables.

D-arioSpace · 2022-11-28T17:25:53Z

Hi @ceb8! I had a look to the code and it looks fine. I also tested the main functionality within the list.py module and all of them are working!
I would just remove the extra white spaces between the asteroid's number and its name (1) for all lists. I also saw that in the two 'priority' lists the time is in isot format, while in all the others it is iso (2). Both comments are not so relevant, but in case you think it's useful to implement them, I'll leave them to you.

I leave below the steps to reproduce the behavior I mentioned:
(1)

ast = neocc.query_list(list_name='nea_list')
ast[1]
<Row index=1>
      NEA       
     str27      
----------------
719       Albert

(2)

out1 = neocc.query_list(list_name='priority_list')
out1[0][7]
<Time object: scale='utc' format='isot' value=2022-11-30T00:00:00.000>
out2 = neocc.query_list(list_name='risk_list')
out2[0][3]
<Time object: scale='utc' format='iso' value=2056-12-12 21:39:00.000>

D-arioSpace · 2023-01-13T15:04:06Z

Hi @ceb8 just to inform you I added a very small fix to tabs.py and to the 433.clolin file as the latter changed and the former has been adapted to that change

ceb8 · 2023-01-27T16:25:05Z

@D-arioSpace I’ve fully removed pandas from this module. In doing so I also tried to regularise the output formats, so now instead of query-specific objects the outputs are all either a single astropy table or a list of them.

I haven’t updated the documentation or the tests, so this is still very much a work in progress. But I wanted to make sure we are all on the same page in terms of the functionality before working on that.

I have a couple of big picture questions that @bsipocz and @keflavich may want to also weigh in on:

query_object sometimes returns a single Astropy table, and sometimes a list of them, would it be better to separate out the list-returning queries into a separate function? (or alternately always return a list, but sometimes the list is only of length 1)
Two table types for query_object (orbit properties and ephemeradies) require additional arguments, might it make sense to split these queries out into seperate functions?
This is mainly for @bsipocz and @keflavich, this module does not use the async_to_sync paradigm. It would not be too hard to update it do that, but it seems like we are moving away from that structure, what do you think about changing it vs leaving as is?
Generally how do you feel about the functionality, and output formats? I put all the non-tabular information in the meta property of the table(s):

impacts_table = neocc.query_object(name='1979XB', tab='impacts')
pd_tble.meta

A lot of the datetimes produce dubious year warnings. I looked up why, and it seems basically unavoidable with dates too far in the past/future using UTC, what are your opinions on how to handle this?

D-arioSpace · 2023-04-14T15:20:57Z

Hi @ceb8 sorry for the big delay in answering and many thanks for the update of this interface!

Regarding your big picture questions, I personally don't have any issues with the current behavior of query_object returning either a single table or a list of tables, but it would be good to get feedback from @bsipocz and @keflavich as well, since they are the experts in this area.

Regarding the two table types that require additional arguments, I'm okay with leaving them as is for now, but if anyone else has strong opinions on this, we can revisit it later.

In terms of functionality and output formats, I don't have any major issues with them. However, I did notice an issue with error handling when calling query_object with an invalid tab name, for example:
temp = neocc.query_object(name='1979XB', tab='s')
it returns me a non-clean message like:
KeyError: 'impactsPlease introduce a valid tab name. valid tabs names are: , close_approachesPlease introduce a valid tab name. valid tabs names are: , observationsPlease introduce a valid tab name. valid tabs names are: , physical_propertiesPlease introduce a valid tab name. valid tabs names are: , orbit_propertiesPlease introduce a valid tab name. valid tabs names are: , ephemeridesPlease introduce a valid tab name. valid tabs names are: , summary'.

The error message that is returned is not very clean and includes multiple repeated messages.

Additionally, I encountered an error when trying to retrieve observations for object "433" (i.e. neocc.query_object(name='433', tab='observations') ). It looks like there may be a bug in the code related to radar observations and the "AsteroidObservation" class, in the “_get_rad_into” method. It would be helpful to investigate this further and see if we can resolve the issue (please let me know if there's anything I can do to help. I anticipate having more time available in the near future to dedicate to this project, so please don't hesitate to reach out to me if there's anything I can assist with.).

Overall, thanks for the updates! Let me know if there's anything I can do to help.

D-arioSpace · 2023-06-21T14:10:01Z

Hi @ceb8 I was wondering if the work is progressing smoothly or if I could assist you in clarifying any doubts or concerns you might have.

ceb8 · 2023-06-30T13:12:29Z

@D-arioSpace Thanks for checking in, I'm just getting back to this after a busy period. Seems pretty straightforward. I'll leave things as they are for now, fix the bugs, and move forward updating the testing to match the new functionality.

D-arioSpace · 2023-06-30T13:36:21Z

Thank you very much @ceb8, I confirm my availability to clarify any doubts you may have about files and functions. I hope to hear from you soon then! Thanks again.

D-arioSpace · 2023-07-05T13:36:27Z

Hi @ceb8 sorry for bothering you again. I'm working with NEOCC data for some projects and I was wondering if you believe it would be possible to have the code integrated into astroquery by September. This integration would greatly benefit these other projects, as they would be able to utilise the astroquery package directly instead of relying on the "standalone" version for NEOCC data.

Thank you again for your time and effort!

ceb8 · 2023-07-07T15:22:33Z

@D-arioSpace That seems like a reasonable timeframe.

ceb8 · 2023-07-21T14:41:24Z

@bsipocz @keflavich Could I get your opinions on the big picture questions I mentioned above?

ceb8 · 2023-07-21T14:46:01Z

@D-arioSpace Looking into the neocc.query_object(name='433', tab='observations') error, and I can't reproduce it. I get a table that looks superficially fine to me as output, can you describe what you're seeing?

keflavich · 2023-07-21T15:10:43Z

query_object sometimes returns a single Astropy table, and sometimes a list of them, would it be better to separate out the list-returning queries into a separate function? (or alternately always return a list, but sometimes the list is only of length 1)

Yes, that's a good idea if there are two distinct routes that both make sense. For some services, like Vizier, queries are supposed to return lists of catalogs, so it just makes sense that that's the return.

Two table types for query_object (orbit properties and ephemeradies) require additional arguments, might it make sense to split these queries out into seperate functions?

No strong opinion here.

This is mainly for @bsipocz and @keflavich, this module does not use the async_to_sync paradigm. It would not be too hard to update it do that, but it seems like we are moving away from that structure, what do you think about changing it vs leaving as is?

I'm fine with the approach that is less work now. async_to_sync has proved useful in some debug cases, but I think the asynchronous queries have not seen widespread adoption and do not have particularly broad use cases. Since we don't really document how to use those, no one is using them.

Generally how do you feel about the functionality, and output formats? I put all the non-tabular information in the meta property of the table(s):

At a glance, that seems fine to me, but I'm not a NEOCC user - it would be better to find people using the data and get their opinions. From a structural perspective, this seems fine.

A lot of the datetimes produce dubious year warnings. I looked up why, and it seems basically unavoidable with dates too far in the past/future using UTC, what are your opinions on how to handle this?

Maybe we should silence those warnings in this context, since this is a context where far past/future dates are expected.

ceb8 · 2023-07-21T15:13:34Z

Thanks @keflavich! To your first point, I can't tell which paradigm you are advocating. Always returning a list?

keflavich · 2023-07-21T15:17:19Z

I'm ambivalent, actually - always returning a list is right if we're sticking with one function, but I'm also OK with splitting into two functions if there is an understandable difference between two approaches. I would avoid having one function return lists sometimes and tables other times.

D-arioSpace · 2023-08-02T08:59:55Z

Hello @ceb8 ,
So, I have performed other tests, and most of them are okay.
However, I noticed some errors in some calls (but not in the one I mentioned in the previous comment) specifically the call:

zzz = neocc.query_object(name='2023DW', tab='close_approaches')

returns me an error. I investigated it a bit, and it is due to an extra line NEOCC added at the beginning of the close approach file of each asteroid. To solve it, you can tell the program to skip the first line in the pd.read_csv, in class CloseApproaches in the static method clo_appr_parser(data_obj):

            df_close_appr = pd.read_csv(df_impacts_d,
                                        delim_whitespace=True, **skiprows=1**)

Additionally, we have a similar issue with the call:

zzz = neocc.query_object(name='433 Eros', tab='orbit_properties', orbital_elements="keplerian", orbit_epoch="middle");

which returns me the following error:

zzz = neocc.query_object(name='433 Eros', tab='orbit_properties', orbital_elements="keplerian", orbit_epoch="middle");
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dario/dev/astroqyery_intergation/astroquery/astroquery/esa/neocc/core.py", line 477, in query_object
    neocc_obj = tabs.parse_orbital_properties(resp_str)
  File "/home/dario/dev/astroqyery_intergation/astroquery/astroquery/esa/neocc/tabs.py", line 655, in parse_orbital_properties
    section = re.search("(\w+ parameters)", body_lst[3]).groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

This time I'm not sure where the error come from, I'm still investigating it.

Furthermore, we have recently incorporated an additional column into the close approach lists (both the recent and upcoming lists). This new column, a float value, indicates the frequency of a specific event based on the miss distance from Earth, the size of the asteroid, and its relative velocity. It is important to note that both calls still function properly, so if you believe it is worthwhile to include this information, feel free to do so.

bsipocz · 2023-08-30T18:00:10Z

@ceb8 - There is a lot of commits here, I would even guess duplicates, certainly duplicated commit messages, so I wonder whether you could cleanup the PR, rebase and squash things out a bit?

Also, I wonder whether the total diff of ~150k is necessary, I assume it's due to bundled data files, but should they be this many/big?

codecov · 2023-09-07T10:57:45Z

Codecov Report

Attention: Patch coverage is 82.53452% with 215 lines in your changes missing coverage. Please review.

Project coverage is 69.89%. Comparing base (c5b7a81) to head (8463247).
Report is 175 commits behind head on main.

Files with missing lines	Patch %	Lines
astroquery/esa/neocc/test/test_neocc_remote.py	18.51%	176 Missing ⚠️
astroquery/esa/neocc/tabs.py	90.40%	31 Missing ⚠️
astroquery/esa/neocc/core.py	91.83%	4 Missing ⚠️
astroquery/esa/neocc/test/setup_package.py	50.00%	2 Missing ⚠️
astroquery/esa/neocc/test/test_neocc.py	99.61%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2254      +/-   ##
==========================================
+ Coverage   69.10%   69.89%   +0.79%     
==========================================
  Files         232      241       +9     
  Lines       19632    20863    +1231     
==========================================
+ Hits        13566    14582    +1016     
- Misses       6066     6281     +215

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

ceb8 · 2024-02-06T17:17:08Z

Rebased and addressed @bsipocz's comments.

D-arioSpace · 2024-02-14T16:01:56Z

Hi @ceb8 I just had a look at the changes you introduced. Thank you! I also see that in the last push, the tests all passed. At this point I just want to ask if there is anything else to do or if you know approximately when the package will be integrated, just to inform some users and collogues who are using the stand_alone version to switch to the astroquery integrated version. Thank you again for your effort :)

THoffmann281 · 2024-03-08T12:37:34Z

@D-arioSpace thank you for the work!
I just would like to express my interest for a release version of the NEOCC python interface in the astroquery library as I am currently using it for some research work on NEOs. I hope that you and the reviewers @bsipocz @Galaksio can find a final solution anytime soon. It would be really nice to have the changes approved and merge it into astroquery, so there is a release that I could cite as reference! :)

XePeleato · 2024-12-09T13:25:26Z

Hi everyone, I just rebased everything and added a couple of small changes. Please let me know if there are any comments pending to be addressed.

XePeleato · 2025-03-10T13:50:44Z

Hi @bsipocz, I believe this is ready from our side. Could I please ask you to provide feedback on it? Docs are failing, but it doesn't seem to be related to us.

Thanks!

bsipocz · 2025-03-14T05:01:40Z

Thank you! I'll come back to this by the end of the month.

XePeleato · 2025-05-13T08:10:17Z

Hi @bsipocz, I just wanted to follow up on this PR – is there anything I can do to help move it forward? I'm happy to make any changes if needed. Thanks again!

bsipocz · 2025-05-13T13:53:33Z

Yes, I'm sorry it's being stuck in a review queue. I'm trying to wrap a review up by the end of the month.

ceb8 requested changes Jan 31, 2022

View reviewed changes

ceb8 force-pushed the main branch from d982199 to 8ff974b Compare January 27, 2023 15:39

bsipocz added the New Service label May 8, 2023

ceb8 force-pushed the main branch from 66bdb72 to 6bacfa6 Compare September 6, 2023 14:23

ceb8 approved these changes Sep 7, 2023

View reviewed changes

ceb8 force-pushed the main branch 3 times, most recently from d3a087a to 30e854f Compare February 7, 2024 15:54

XePeleato force-pushed the main branch from 30e854f to c3398f0 Compare December 9, 2024 13:17

SSA Bot and others added 16 commits March 10, 2025 14:31

Initial NEOCC functionality

79c04f3

Adding test files

f3724bf

remote tests

f11d47e

removing pandas from lists functions

3a8c81a

Update tabs.py and the clolin file

27b15cb

removing pandas from tabs.py

0bb5ccf

temporarily removing from solarsystem init to avoid conflict

7c21a9d

completing the remote tests

65d821b

completing the monkey-patched tests

a56aede

small test fixes

d970f83

Updating doc strings

452ab6d

adding remote data annotations

b15035c

updating docs

3db69e5

Fixs for building docs etc

893421c

addressing comments on docs

0ea5403

Add CAI description & Fix close approach parsing for impacted objects

6fe0dbb

XePeleato force-pushed the main branch from c3398f0 to 6fe0dbb Compare March 10, 2025 13:32

Remove unsupported note section from docstring

8463247

Uh oh!

add NEOCC python interface to astroquery library #2254

Are you sure you want to change the base?

add NEOCC python interface to astroquery library #2254

Uh oh!

Conversation

D-arioSpace commented Dec 22, 2021

Uh oh!

pep8speaks commented Dec 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comment last updated at 2024-02-07 15:54:47 UTC

Uh oh!

ceb8 left a comment

Choose a reason for hiding this comment

Uh oh!

D-arioSpace commented Mar 11, 2022

Uh oh!

ceb8 commented Mar 15, 2022

Uh oh!

D-arioSpace commented Sep 8, 2022

Uh oh!

ceb8 commented Sep 8, 2022

Uh oh!

D-arioSpace commented Sep 9, 2022

Uh oh!

ceb8 commented Sep 9, 2022

Uh oh!

ceb8 commented Nov 8, 2022

Uh oh!

D-arioSpace commented Nov 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

D-arioSpace commented Jan 13, 2023

Uh oh!

ceb8 commented Jan 27, 2023

Uh oh!

D-arioSpace commented Apr 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

D-arioSpace commented Jun 21, 2023

Uh oh!

ceb8 commented Jun 30, 2023

Uh oh!

D-arioSpace commented Jun 30, 2023

Uh oh!

D-arioSpace commented Jul 5, 2023

Uh oh!

ceb8 commented Jul 7, 2023

Uh oh!

ceb8 commented Jul 21, 2023

Uh oh!

ceb8 commented Jul 21, 2023

Uh oh!

keflavich commented Jul 21, 2023

Uh oh!

ceb8 commented Jul 21, 2023

Uh oh!

keflavich commented Jul 21, 2023

Uh oh!

D-arioSpace commented Aug 2, 2023

Uh oh!

bsipocz commented Aug 30, 2023

Uh oh!

codecov bot commented Sep 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ceb8 commented Feb 6, 2024

Uh oh!

D-arioSpace commented Feb 14, 2024

Uh oh!

THoffmann281 commented Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

XePeleato commented Dec 9, 2024

Uh oh!

XePeleato commented Mar 10, 2025

Uh oh!

bsipocz commented Mar 14, 2025

Uh oh!

pep8speaks commented Dec 22, 2021 •

edited

Loading

D-arioSpace commented Nov 28, 2022 •

edited

Loading

D-arioSpace commented Apr 14, 2023 •

edited

Loading

codecov bot commented Sep 7, 2023 •

edited

Loading

THoffmann281 commented Mar 8, 2024 •

edited

Loading