Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor all wrappers to pass an argument list to Session.call_module #3132

Merged
merged 8 commits into from
Apr 18, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented Mar 21, 2024

TL;DR

When calling a GMT module, we pass a single text string of CLI arguments (e.g., "-R0/10/0/10 -JX10c -BWSen+tMy Title") to the C API. It has caused many troubles if an argument contains whitespaces (e.g., #247, #655, #2298 and more) or single/double quotation marks (e.g., #3104 and more), and we have to introduce some hacking solutions (e.g., #1487).

Instead of passing a single text string, we can pass a list of argument string instead (e.g., ["-R0/10/0/10", "-JX10c", "-BWSen+tMy Title"]. In this way, we no longer need any hack solutions and everything just works as expected!

Technical backgrounds

Currently, the Session.call_module() method have two parameters: module is name of a GMT module, and args is the text string of CLI arguments. For example, we usually call the method like this way:

lib.call_module(module="basemap", args=build_arg_string(kwargs))

in which build_arg_string converts a keyword dictionary into a single text string for CLI arguments (e.g., "-R0/10/0/10 -JX10c -BWSen+tMy Title").

Actually, the C API function GMT_Call_Module allows passing CLI arguments in three different modes:

  1. Pass a single text string for CLI arguments "-R0/10/0/10 -JX10c -BWSen+tMy Title"
  2. Pass a list of strings and each string is an option ["-R0/10/0/10", "-JX10c", "-BWSen+tMy Title"]
  3. Pass a list of GMT_OPTION structure

Currently, we use mode 1. As mentioned above, it has caused many trouble if an argument has whitespace and single/double quotation marks. GMT_Call_Module's mode parameter is actually passed to the n_args_in parameter in another C API function GMT_Create_Options. Here is the description of the GMT_Create_Options function:

	/* This function will loop over the n_args_in supplied command line arguments (in) and
	 * then creates a linked list of GMT_OPTION structures for each program option.
	 * These will in turn will be processed by the module-specific option parsers.
	 * What actually happens is controlled by n_args_in.  There are three cases:
	 * n_args_in < 0 : This means that in is already a linked list and we just duplicate and return a copy.
	 * n_args_in == 0: in is a single text string with multiple options (e.g., "-R0/2/0/5 -Jx1 -O -K > file")
	 *		   and we must first break this command string into separate words.
	 * n_args_in > 0 : in is an array of text strings (e.g., argv[]).
	 *
	 * Note: 1. If argv[0] is the calling program name, make sure to pass argc-1, args+1 instead.
	 *	 2. in == NULL is allowed only for n_args_in <= 0 (an empty list of options).
	 *
	 * The linked list returned by GMT_Create_Options should be freed by GMT_Destroy_Options().
	 */

So, if mode 1 is used, the GMT_Create_Options function first splits the single text string into an array of strings (i.e., mode 2), and then converts the array of strings to a list of GMT_OPTION structure (i.e., mode 3). As can be seen in the GMT_Create_Options source codes, for mode 1, GMT tries very hard to deal with whitespace and quotation marks and we're repeating the same work in the build_arg_string function. Mode 1 is definitely the "hardest" mode for us and mode 3 means we have to wrap the GMT_OPTION structure, so why not just use mode 2!

Changes in this PR

  • Let Session.call_module accept a list of strings (i.e., mode 2). Passing a single string is still supported for backward-compatibility Session.call_module: Support passing a list of argument strings #3139
  • Provide a new function build_arg_list function which does the same thing as build_arg_string but returns a list. The build_arg_list doesn't need any hacking solutions as in build_arg_string. Add function build_arg_list for building arguments list from keyword dictionaries #3149
  • Replace build_arg_string with build_arg_list in all module wrappers
  • Fix any edge cases that are not covered by the step above
  • Update all tests that use the Session.call_module method
  • Some tests may no longer make sense and should be removed
  • Need to add more tests
  • If data=["file1.txt", "file2.txt"], data_kind(data) returns matrix, which cause failures. I guess we need to refactor the data_kind function.

To Be Done/Decided

  • Raise a deprecation warning when passing a single string to Session.call_module and fully remove the support in v0.20.0 or even v1.0? Likely no, because sometimes passing a single string is convenient but we should avoid using it in the PyGMT source codes.
  • We need to keep the build_arg_string function for some time because we know some users write their own wrappers and use the build_arg_string function. A deprecation warning makes sense.

Benifits

  • No hacking solutions needed
  • Natively support for passing multiple files (e.g., Figure.plot(data=["file1.txt", "file2.txt"])

Supersede PRs #1487, #1116, #2331, #2726.

Also fixes issue #3104 without the solution in #3105, but I'll open some more discussions about quotation marks, so this PR won't close issue #3104.

@seisman seisman force-pushed the pass-arg-list-to-module branch 4 times, most recently from 82d7035 to b52469f Compare March 21, 2024 14:15
@seisman seisman changed the title Session.call_module: Pass a list of argument strings when call GMT modules POC: Session.call_module: Pass a list of argument strings when call GMT modules Mar 21, 2024
@seisman seisman added the enhancement Improving an existing feature label Mar 21, 2024
@seisman seisman force-pushed the pass-arg-list-to-module branch 2 times, most recently from 14e6ccc to 9fb52d6 Compare March 22, 2024 07:21
@weiji14
Copy link
Member

weiji14 commented Mar 22, 2024

Seriously? After all these years and countless workarounds, and there was a simpler way?!!

@seisman
Copy link
Member Author

seisman commented Mar 22, 2024

Seriously? After all these years and countless workarounds, and there was a simpler way?!!

😄 Should have read the C source codes earlier.

@weiji14 weiji14 changed the title POC: Session.call_module: Pass a list of argument strings when call GMT modules POC: Session.call_module: Pass a list of arguments when calling GMT modules Mar 22, 2024
pygmt/clib/session.py Outdated Show resolved Hide resolved
@seisman seisman changed the base branch from main to call-module-arglist March 27, 2024 13:36
Base automatically changed from call-module-arglist to main March 31, 2024 05:34
@seisman seisman force-pushed the pass-arg-list-to-module branch 4 times, most recently from 234cdf1 to 816cafd Compare March 31, 2024 06:13
@seisman seisman changed the base branch from main to build_arg_list March 31, 2024 06:22
@seisman seisman changed the base branch from build_arg_list to main March 31, 2024 06:23
@seisman

This comment was marked as outdated.

@seisman seisman changed the title POC: Session.call_module: Pass a list of arguments when calling GMT modules Refactor all wrappers to pass an argument list to Session.call_module Apr 8, 2024
@seisman seisman added this to the 0.12.0 milestone Apr 8, 2024
@seisman seisman marked this pull request as ready for review April 8, 2024 03:26
@seisman seisman added the needs review This PR has higher priority and needs review. label Apr 8, 2024
@seisman
Copy link
Member Author

seisman commented Apr 8, 2024

After PR #3139 and #3149, now it's time to refactor module wrappers to pass a list of arguments when calling modules. More than 60 files are changed in this PR but in most files, the changes are simple (just replacing build_arg_string with build_arg_list). There are a few exceptions and please pay more attention to these files:

  • pygmt/figure.py
  • pygmt/helpers/utils.py
  • pygmt/src/config.py
  • pygmt/src/legend.py
  • pygmt/src/text.py
  • pygmt/src/which.py

@seisman
Copy link
Member Author

seisman commented Apr 16, 2024

/format

@seisman seisman requested a review from a team April 16, 2024 14:14
@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Apr 18, 2024
@seisman
Copy link
Member Author

seisman commented Apr 18, 2024

Ping @GenericMappingTools/pygmt-maintainers for final review call. Will merge in 48 hours if no further comments.

Copy link
Member

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!



@fmt_docstring
@use_alias(G="download", V="verbose")
@kwargs_to_strings(fname="sequence_space")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need to do it in this PR, but should we remove sequence_space from decorators.py at some point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants