WIP: Add return_table helper function #1336

willschlitzer · 2021-06-16T06:42:36Z

This is a proof-of-concept pull request for the return_table helper function in utils.py. It accepts the default string output of a table from the GMT C API and can return a string, a numpy array, or a pandas DataFrame. This is a possible answer for #1318.
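
To illustrate the idea only (this is not the code from the PR), a minimal sketch of such a conversion might look like the following. The function name `table_from_string`, its parameters, and the assumption that the table is purely numeric and whitespace-delimited are all hypothetical.

```python
import io

import numpy as np
import pandas as pd


def table_from_string(result, data_format="numpy", columns=None):
    """Convert whitespace-delimited table text to the requested type.

    Hypothetical illustration; the name and parameters are assumptions.
    """
    if data_format == "str":
        return result
    # Parse rows of whitespace-separated numbers into a 2-D float array
    array = np.loadtxt(io.StringIO(result), ndmin=2)
    if data_format == "numpy":
        return array
    if data_format == "pandas":
        return pd.DataFrame(array, columns=columns)
    raise ValueError(f"Unrecognized data_format: {data_format!r}")


text = "1.0 2.0 3.0\n4.0 5.0 6.0\n"
table = table_from_string(text, "pandas", columns=["x", "y", "z"])
```

A table containing text columns or header lines would need extra handling that this sketch leaves out.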

Fixes #

Reminders

Run make format and make check to make sure the code follows the style guide.
Add tests for new features or tests that would have caught the bug that you're fixing.
Add new public functions/methods/classes to doc/api/index.rst.
Write detailed docstrings for all functions/methods.
If adding new functionality, add an example to docstrings or tutorials.

Slash Commands

You can write slash commands (/command) in the first line of a comment to perform
specific operations. Supported slash commands are:

/format: automatically format and lint the code
/test-gmt-dev: run full tests on the latest GMT development version

maxrjones · 2021-06-17T15:40:06Z

pygmt/helpers/utils.py

@@ -267,3 +269,48 @@ def args_in_kwargs(args, kwargs):
        If one of the required arguments is in ``kwargs``.
    """
    return any(arg in kwargs for arg in args)
+
+
+def return_table(result, data_format, format_parameter, df_columns):


Looks promising! A few comments:

1. I think you will need to use this with one of the existing functions so that we can test it out in this PR. In my opinion, grdtrack would be a good option.
2. My preference would be to add a format_options parameter for return_table. This could take a list, with default values including numpy, pandas, and str. This way, if it doesn't make sense for a particular function to return string values, for example, that option could be left out of that function's documentation/implementation.
3. Table-like options for input are now str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame. It would be nice if all of these were output options too.
4. I would prefer a more descriptive argument for the requested data format than a|d|s. Something like numpy|pandas|str seems more readable.
5. I think df_columns needs to be optional.
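
As a rough sketch of points 2 and 4 above, a helper could validate a descriptive format name against a per-function list of allowed formats. The names `check_data_format` and `DEFAULT_FORMATS` are hypothetical, and nothing here is taken from the PR itself.

```python
DEFAULT_FORMATS = ("numpy", "pandas", "str")


def check_data_format(data_format, format_options=DEFAULT_FORMATS):
    """Return data_format if it is allowed, otherwise raise ValueError."""
    if data_format not in format_options:
        allowed = ", ".join(format_options)
        raise ValueError(
            f"data_format must be one of: {allowed} (got {data_format!r})"
        )
    return data_format


# A function for which string output makes no sense can narrow the choices
chosen = check_data_format("pandas", format_options=("numpy", "pandas"))
```

Keeping the allowed formats in one tuple lets each wrapper document only the options it actually supports.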

willschlitzer · 2021-06-17

> 3. table-like options for input are now `str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame`. It would be nice if all of these were output options too.

Sounds good; I'll have to get smarter on the last two, but I don't see why it should be a problem.

> 4. I would prefer a more descriptive argument for requested data format than **a**|**d**|**s**. Something like `numpy`|`pandas`|`str` seems more readable.

I like the idea of keeping it short, especially when there is a default option (I anticipate it being a numpy array) and the strings are not also the same word as Python modules or variable types. But I understand how the single letters could be confusing.

> 5. I think `df_columns` needs to be optional.

Since this is a helper function, I envisioned that the argument for df_columns would be set up in the GMT function that is using it, such as using ["x", "y", "z"] when calling this function inside of grd2xyz.

willschlitzer · 2021-06-25 (edited)

> 3. table-like options for input are now `str or numpy.ndarray or pandas.DataFrame or xarray.Dataset or geopandas.GeoDataFrame`. It would be nice if all of these were output options too.

Added in c8cef5e

weiji14 · 2021-06-30

> > I would prefer a more descriptive argument for requested data format than a|d|s. Something like numpy|pandas|str seems more readable.
>
> I like the idea of keeping it short, especially when there is a default option (I anticipate it being a numpy array) and the strings are not also the same word as Python modules or variable types. But I understand how the single letters could be confusing.

I'll have to agree with Meghan that long descriptive names like numpy|pandas|str are preferable 🙂

willschlitzer · 2021-06-22T07:30:02Z

Added return_table() to grdtrack. I have a few issues:

I'm not sure how to inform the user of the ValueError (lines 307-308) if it can't convert a value in the table to a float (such as a title or section header). My initial reaction was to print that it cannot convert the value to float, but that is a TON of prints, as every line of text is split up by word.
Specifically for grdtrack, I removed the use of data_kind since column names don't need to be specified by the user (but can be with the df_columns= parameter), but should there still be a check of the information format of the table?
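
One possible way to avoid a flood of prints, sketched here under the assumption that whole lines (rather than individual words) are the unit of conversion, is to collect the offending line numbers and emit a single warning at the end. The function `parse_numeric_rows` is hypothetical and is not the PR's implementation.

```python
import warnings


def parse_numeric_rows(text):
    """Return rows of floats, skipping lines that are not fully numeric."""
    rows, skipped = [], []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines silently
        try:
            rows.append([float(token) for token in line.split()])
        except ValueError:
            # Remember the line instead of reporting it immediately
            skipped.append(line_number)
    if skipped:
        warnings.warn(
            f"Skipped {len(skipped)} non-numeric line(s): {skipped}",
            stacklevel=2,
        )
    return rows
```

Callers who consider a non-numeric line an error can turn the warning into an exception with the standard `warnings` filters.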


willschlitzer · 2021-06-28T06:49:06Z

I can't see why this is failing deployment, but it is running into a ModuleNotFoundError: No module named 'geopandas' for the Python 3.7 CI job. I asked the question in #1354 about making geopandas a dependency, and assume the fix for both pull requests will be the same.

willschlitzer · 2021-06-28T22:21:31Z

@weiji14 I tried adding in gpd = pytest.importorskip("geopandas") but am still running into a ModuleNotFoundError for geopandas on Python 3.7/NumPy 1.17. Any idea how to fix this to make the tests pass?

willschlitzer · 2021-06-29T21:15:50Z

I'm unable to figure out why this is causing the deployments to fail. My guess is it has something to do with trying to incorporate it into grdtrack (as deployment and tests started to fail after that change) but I'm not sure why that is causing a problem.

weiji14 · 2021-06-30T02:29:09Z

pygmt/helpers/utils.py

@@ -10,6 +10,9 @@
 from collections.abc import Iterable
 from contextlib import contextmanager

+import geopandas as gpd


This import geopandas line shouldn't be here at the top-level. I'd suggest importing geopandas in the return_table function itself if you need it, and only under elif data_format=="geopandas".

Suggested change

import geopandas as gpd
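
A minimal sketch of the deferred-import pattern described in this review comment follows. The function `convert_points` and its parameters are hypothetical; only the idea of importing geopandas inside the `elif data_format == "geopandas"` branch comes from the comment.

```python
import numpy as np


def convert_points(array, data_format="numpy"):
    """Hypothetical converter that imports geopandas only when requested."""
    if data_format == "numpy":
        return np.asarray(array)
    elif data_format == "geopandas":
        # Deferred import: geopandas stays an optional dependency, so
        # importing this module never fails when it is not installed
        import geopandas as gpd

        points = np.asarray(array)
        geometry = gpd.points_from_xy(points[:, 0], points[:, 1])
        return gpd.GeoDataFrame(geometry=geometry)
    raise ValueError(f"Unrecognized data_format: {data_format!r}")
```

With this layout, environments without geopandas can still import the module and use every other output format; only a request for "geopandas" raises ImportError.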

weiji14 · 2021-06-30T02:35:24Z

pygmt/src/grdtrack.py

+    points,
+    grid,
+    data_format="d",
+    df_columns=["longitude", "latitude", "z-value"],


Not sure if it's a good idea to have default column names, especially for the x (longitude) and y (latitude) columns since someone passing in a pandas.DataFrame table with existing column names would have their column names overridden by this default.
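
One way to address this concern, sketched under stated assumptions, is to default df_columns to None and fall back to the column names of a DataFrame input. The helper `output_columns` and the generic `col<i>` naming for extra columns are hypothetical and are not part of PyGMT.

```python
import pandas as pd


def output_columns(points, n_columns, df_columns=None):
    """Pick column names for an output table with n_columns columns.

    Hypothetical helper: explicit df_columns wins, then names carried over
    from a DataFrame input, then None (pandas assigns 0, 1, 2, ...).
    """
    if df_columns is not None:
        return list(df_columns)
    if isinstance(points, pd.DataFrame):
        names = [str(name) for name in points.columns]
        # Name any extra output columns (e.g. sampled values) generically
        names += [f"col{i}" for i in range(len(names), n_columns)]
        return names[:n_columns]
    return None


points = pd.DataFrame({"lon": [1.0], "lat": [2.0]})
names = output_columns(points, n_columns=3)
```

Using None as the default also avoids a mutable list as a default argument value.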

willschlitzer · 2021-07-19T15:22:39Z

Closing this PR. There may be better ways to return table-like data, and I think the best move is to wrap more of the functions that return table-like data before trying to figure out how best to refactor them to use a helper function.

add return_table() to utils.py

e668ec1

willschlitzer added the feature Brand new feature label Jun 16, 2021

willschlitzer added this to the 0.5.0 milestone Jun 16, 2021

willschlitzer self-assigned this Jun 16, 2021

add import statements for pandas and numpy

6d8b2e0

vercel bot temporarily deployed to Preview June 16, 2021 06:50 Inactive

maxrjones mentioned this pull request Jun 17, 2021

Consistent table-like output for PyGMT functions/methods #1318

Closed

13 tasks

maxrjones reviewed Jun 17, 2021


willschlitzer added 3 commits June 22, 2021 07:47

update return_table import

10f8238

add exception handling for return_table

3546ef6

update grdtrack.py and test_grdtrack.py to use return_table

8e59e57

vercel bot temporarily deployed to Preview June 22, 2021 07:22 Inactive

willschlitzer added 2 commits June 22, 2021 08:36

format grdtrack gallery example for using df_columns parameter

88832db

add test_grdtrack_output_types()

3a98346

maxrjones mentioned this pull request Jun 23, 2021

Improve input/output options for blockm* and grdtrack #1099

Closed

2 tasks

vercel bot temporarily deployed to Preview June 25, 2021 07:54 Inactive

willschlitzer added 3 commits June 25, 2021 09:06

add xarray and geopandas output options for return_table()

c8cef5e

add tests for xarray and GeoDataFrame in test_grdtrack.py

46f2cd4

add tests for xarray and GeoDataFrame as output options for return_table()

b2a388f

vercel bot temporarily deployed to Preview June 25, 2021 08:13 Inactive

change assert type to assert isinstance

d7084e5

vercel bot temporarily deployed to Preview June 25, 2021 08:20 Inactive

move geopandas import in test_grdtrack.py to use importskip

935bdaa

vercel bot temporarily deployed to Preview June 28, 2021 21:57 Inactive

Merge branch 'master' into table-output-function

eae2eef

vercel bot temporarily deployed to Preview June 28, 2021 22:05 Inactive

remove geopandas from available return table options

c65fb0e

add test for failure for invalid output type

61f6126

vercel bot temporarily deployed to Preview June 29, 2021 21:07 Inactive

willschlitzer mentioned this pull request Jun 29, 2021

Wrap grd2xyz #1284

Merged

5 tasks

weiji14 reviewed Jun 30, 2021


willschlitzer changed the title ~~Add return_table helper function~~ WIP: Add return_table helper function Jul 12, 2021

willschlitzer closed this Jul 19, 2021

seisman modified the milestones: 0.5.0, 0.4.1 Jul 25, 2021

weiji14 removed this from the 0.4.1 milestone Aug 10, 2021

weiji14 deleted the table-output-function branch August 10, 2021 21:14
