-
Notifications
You must be signed in to change notification settings - Fork 229
WIP: Add return_table helper function #1336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -267,3 +269,48 @@ def args_in_kwargs(args, kwargs): | |||
If one of the required arguments is in ``kwargs``. | |||
""" | |||
return any(arg in kwargs for arg in args) | |||
|
|||
|
|||
def return_table(result, data_format, format_parameter, df_columns): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks promising! A few comments:
- I think you will need to use this with one of the existing functions so that we can test it out in this PR. In my opinion, grdtrack would be a good option.
- My preference would be to add a
format_options
parameter forreturn_table
. This could take a list, with defaults values including numpy, pandas, and str. This way, if it doesn't for example make sense to return string values then that could not be given as an option in the individual function documentation/implementation. - table-like options for input are now
str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame
. It would be nice if all of these were output options too. - I would prefer a more description argument for requested data format than a|d|s. Something like
numpy
|pandas
|str
seems more readable. - I think
df_columns
needs to be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3. table-like options for input are now `str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame`. It would be nice if all of these were output options too.
Sounds good; I'll have to get smarter on the last two but I don't see why it should be a problem.
4. I would prefer a more description argument for requested data format than **a**|**d**|**s**. Something like `numpy`|`pandas`|`str` seems more readable.
I like the idea of keeping it short, especially when there is a default option (I anticipate it being a numpy array) and the strings are not also the same word as Python modules or variable types. But I understand how the single letters could be confusing.
5. I think `df_columns` needs to be optional.
Since this is a helper function, I envisioned that the argument for df_columns
would be set up in the GMT function that is using it, such as using ["x", "y", "z"]
when calling this function inside of grd2xyz
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3. table-like options for input are now `str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame`. It would be nice if all of these were output options too.
Added in c8cef5e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I would prefer a more description argument for requested data format than a|d|s.
Something likenumpy
|pandas
|str
seems more readable.
I like the idea of keeping it short, especially when there is a default option (I anticipate it being a numpy array) and the strings are not also the same word as Python modules or variable types. But I understand how the single letters could be confusing.
I'll have to agree with Meghan that long descriptive names like numpy
|pandas
|str
are preferable 🙂
Added
|
I can't see why this is failing deployment, but it is running into a |
@weiji14 I tried adding in |
I'm unable to figure out why this is causing the deployments to fail. My guess is it has something to do with trying to incorporate it into |
@@ -10,6 +10,9 @@ | |||
from collections.abc import Iterable | |||
from contextlib import contextmanager | |||
|
|||
import geopandas as gpd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import geopandas
line shouldn't be here at the top-level. I'd suggest importing geopandas in the return_table
function itself if you need it, and only under elif data_format=="geopandas"
.
import geopandas as gpd |
points, | ||
grid, | ||
data_format="d", | ||
df_columns=["longitude", "latitude", "z-value"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's a good idea to have default column names, especially for the x (longitude) and y (latitude) columns since someone passing in a pandas.DataFrame
table with existing column names would have their column names overridden by this default.
Closing this PR; I think there may be better ways to return table-like data, but I think the best move is to wrap some more of those functions that do return table like data before trying to figure out how to best refactor them to use a helper function. |
This is a proof-of-concept pull request for the
return_table
helper function inutils.py
. It accepts the default string output of a table from the GMT C API and can return either a string, numpy array, or pandas DataFrame. This is a possible answer for #1318.Fixes #
Reminders
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
.Slash Commands
You can write slash commands (
/command
) in the first line of a comment to performspecific operations. Supported slash commands are:
/format
: automatically format and lint the code/test-gmt-dev
: run full tests on the latest GMT development version