Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add configuration call #82

Merged
merged 1 commit into from
Nov 12, 2020
Merged

Conversation

rhaas80
Copy link
Contributor

@rhaas80 rhaas80 commented Sep 1, 2020

No description provided.

@adammoody
Copy link
Contributor

Looks good. Thanks, @rhaas80

@tonyhutter
Copy link
Collaborator

I feel that making the user pass a kvtree to AXL_Config() to set config values is needlessly complex. It also forces the user to include KVTree in their app's dependencies.

Why not use simple config strings (AXL_Config(id, "key1=value1 key2=value2" ...)) as discussed in #38? This would allow you to set per-ID config values, and have them persist in the state_file if specified. Also, any AXL_Config() code should allow the user to set the config options as many times as they want. I bring this up because this PR only lets them set it once:

    /* only accept configuration options once */
    configured = 1;

Also, please add a description to your PR and all your commits, and please document what the config values are and what they do.

@rhaas80
Copy link
Contributor Author

rhaas80 commented Sep 8, 2020

SCR does offer a string based configuration interface to allow for easy passing of user option. I saw the VeloC libraries as lower level building blocks that will be called from a higher level library that will to the option handling.

All user codes already depend (implicitly) on kvtree since AXL does.

Having that library then construct a string to pass options seems error prone since it looses any hope of type checking (not that the fact that kvtree currently stores only strings helps there).

Having said all of this: duplicating the string parsing code from SCR in each VeloC component is certainly easy enough if that is desired.

Right now the check is in place to reduce the chance of setting parameters, which AXL currently can assume to be constant during a run, to change in the middle of a run (at least to the amount that this is possible given that the config calls happen after AXL_Init). Can obviously be changed, and if option strings are used pretty much must be changed otherwise one has to construct one gigantic option string.

@rhaas80
Copy link
Contributor Author

rhaas80 commented Sep 8, 2020

Please let me know what the preferred option is and i will be happy to implement it.

@rhaas80
Copy link
Contributor Author

rhaas80 commented Sep 8, 2020

Yes, documentation of these options in the user docs is indeed required. Thanks for pointing that out. In the code right now the various #defines correspond 1:1 to existing AXL global variables that have comments (identical in .c and .h files explaining what they do). I am certainly fine triplicating these comments at he location of the #define.

@tonyhutter
Copy link
Collaborator

I saw the VeloC libraries as lower level building blocks that will be called from a higher level library that will to the option handling.

I think we should develop AXL under the assumption that it's a standalone library that anyone could use, rather than just assuming that FILO and VeloC would use it. I could see UnifyFS using it for file stage-in, for example.

All user codes already depend (implicitly) on kvtree since AXL does.

What I mean is that this PR would require users to #include <kvtree.h> and link against libkvtree if they wanted to call AXL_Config(). They currently don't have to do this, nor would they have to do it with a string-based AXL_Config().

Having that library then construct a string to pass options seems error prone since it looses any hope of type checking (not that the fact that kvtree currently stores only strings helps there).

AXL would know the value type in a key=value string, so it would be able to sanity check it. Like it would check that file_buf_size is passed a numeric value, for example.

Right now the check is in place to reduce the chance of setting parameters, which AXL currently can assume to be constant during a run, to change in the middle of a run (at least to the amount that this is possible given that the config calls happen after AXL_Init). Can obviously be changed, and if option strings are used pretty much must be changed otherwise one has to construct one gigantic option string.

You wouldn't need to construct a giant string. You could set one or more values per AXL_Config() call. For example, if you wanted to set the debug file_buf_size and mkdir config values, you could do this:

AXL_Config(id, "debug=1");
AXL_Config(id, "file_buf_size=65536 mkdir=1")

or this:

AXL_Config(id, "debug=1");
AXL_Config(id, "file_buf_size=65536");
AXL_Config(id, "mkdir=1");

or this:

AXL_Config(id, "debug=1 file_buf_size=65536 mkdir=1")

Or any other combination you want. The result would be the same.

A string-based config is also nice because it reduces the amount of boilerplate code you have to write (which also makes the code easier to read). For example, you could reduce your test_config.c code from:

#include <kvtree.h>
...

     kvtree* axl_config_values = kvtree_new(); 
...

    /* check AXL configuration settings */
    rc = kvtree_util_set_bytecount(axl_config_values,
                                   AXL_KEY_CONFIG_FILE_BUF_SIZE,
                                   old_axl_file_buf_size + 1);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_bytecount failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_DEBUG,
                             !old_axl_debug);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_MKDIR,
                             !old_axl_make_directories);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_COPY_METADATA,
                             !old_axl_copy_metadata);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }

    rc = AXL_Config(axl_config_values);

    ...
    kvtree_delete(&axl_config_values);

To:

	char *config;
	asprintf(&config, "file_buf_size=%d debug=%d mkdir=%d copy_metadata=%d",
		old_axl_file_buf_size + 1, !old_axl_debug, !old_axl_make_directories, !old_axl_copy_metadata);
	AXL_Config(config);

I would also propose that AXL_Config() be able to set and retrieve both global and per AXL ID specific values. There are a couple of ways to do this:

  1. Pass an AXL ID value to AXL_Config(). If it's a global value, then pass -1 for the AXL ID:
int id = 42;
AXL_Config(id, "file_buf_size=65535");
AXL_Config(-1, "debug=1");
  1. Have an 'id' key value pair for setting ID specific values:
AXL_Config("id=42 file_buf_size=65535");
AXL_Config("debug=1");

And to retrieve:

char *str = AXL_Config("debug");
printf("debug=%s", str);   // prints "debug=1"

char *str = AXL_Config("id=42 file_buf_size");
printf("file_buf_size=%s", str);   // prints "file_buf_size=65535"

@rhaas80
Copy link
Contributor Author

rhaas80 commented Sep 9, 2020

Uff that was a lengthy comment. I guess I will have to redo my implementation.

I think we should develop AXL under the assumption that it's a standalone library that anyone could use, rather than just assuming that FILO and VeloC would use it. I could see UnifyFS using it for file stage-in, for example.

From my point of view (as a science code developer) UnifyFS would still be a layer between AXL and "me" and would (and should do so I would think) hide the fact that it internally uses AXL and translate its own options into whatever AXL expects.

What I mean is that this PR would require users to #include <kvtree.h> and link against libkvtree if they wanted to call AXL_Config(). They currently don't have to do this, nor would they have to do it with a string-based AXL_Config().

Right now one already has to link against kvtree since AXL links against it. One has to explicitly do so since the current CMake file for AXL is not reporting AXL's own dependent libraries as libraries one needs to link against (at least not that I can see and in a similar situation in er with rankstr I certainly had to manually add it). On a sufficiently old system with dynamic linking the linker will automatically link in the dynamic librraries that libAXL.so uses when one links against libAXL.so but for static linking and newer linkers (that default to not setting --copy-dt-needed-entries see https://stackoverflow.com/questions/8725572/gcc-4-5-vs-4-4-linking-with-dependencies) one does need to list them even for a dynamic link.

#include <kvtree.h> is not needed on the user side since axl.h already includes that file. However this requires that include paths are set up so that the code using AXL actually finds kvtree.h which is a downside and could be avoided by forward declaring kvtree ie typedef struct kvtree_struct kvtree;.

Having that library then construct a string to pass options seems error prone since it looses any hope of type checking (not that the fact that kvtree currently stores only strings helps there).

AXL would know the value type in a key=value string, so it would be able to sanity check it. Like it would check that file_buf_size is passed a numeric value, for example.

Let me be more explicit: passing strings pushes type checks to runtime, while using kvtree_set_int gives a warning at runtime. One can get warnings for strings by using something like AXL_Configf that has printf like syntax and using compiler specific attributes to tell the compiler that ACL_Confgf is printf-like.

You wouldn't need to construct a giant string. You could set one or more values per AXL_Config() call. For example, if you wanted to set the debug file_buf_size and mkdir config values, you could do this:

If I look at the discussion in #38 (eg #38 (comment)) then this does not seem to be (sanely) possible since constructs like "CKPT=0 SCHEME=XOR" are not expected to set both CKPT to 0 and SCHEME to XOR but instead set SCHEME to XOR in the 0 sub-option of CHKPT. Admittedly none of the VeloC compoments actually uses anything like this (only SCR does).

A string-based config is also nice because it reduces the amount of boilerplate code you have to write (which also makes the code easier to read). For example, you could reduce your test_config.c code from:

#include <kvtree.h>
...

     kvtree* axl_config_values = kvtree_new(); 
...

    /* check AXL configuration settings */
    rc = kvtree_util_set_bytecount(axl_config_values,
                                   AXL_KEY_CONFIG_FILE_BUF_SIZE,
                                   old_axl_file_buf_size + 1);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_bytecount failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_DEBUG,
                             !old_axl_debug);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_MKDIR,
                             !old_axl_make_directories);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }
    rc = kvtree_util_set_int(axl_config_values, AXL_KEY_CONFIG_COPY_METADATA,
                             !old_axl_copy_metadata);
    if (rc != KVTREE_SUCCESS) {
        printf("kvtree_util_set_int failed (error %d)\n", rc);
        return rc;
    }

    rc = AXL_Config(axl_config_values);

    ...
    kvtree_delete(&axl_config_values);

To:

	char *config;
	asprintf(&config, "file_buf_size=%d debug=%d mkdir=%d copy_metadata=%d",
		old_axl_file_buf_size + 1, !old_axl_debug, !old_axl_make_directories, !old_axl_copy_metadata);
	AXL_Config(config);

I don't know. This is lacking a free on config. It does not check asprintf for failure. This seems to mostly safe on boilerplate by removing error checks. With full error checks with strings the code and trying to avoid typo errors in the string names of the variables would look like:

char *config;

rc = asprintf(&config, AXL_CONFIG_KEY_FILE_BUF_SIZE"=%lu "AXL_CONFIG_KEY_DEBUG"=%d "AXL_CONFIG_KEY_MKDIR"=%d "AXL_CONFIG_KEY_COPY_METADATA"=%d",
  (unsigned long) old_axl_file_buf_size + 1, !old_axl_debug, !old_axl_make_directories, !old_axl_copy_metadata);
if(rc == -1) {
  printf("asptrinf failed (error %d)\n", rc);
  return 1;
}
rc = AXL_Config(config);
if(rc != AXL_SUCCESS) {
  return 1;
}
free(config);

which is still quite a bit shorter since it combines multiple kvtree_set_XXX calls into a single asprintf.

I would also propose that AXL_Config() be able to set and retrieve both global and per AXL ID specific values. There are a couple of ways to do this:

sure.

I will implement a demo setup and let you have a look.

@tonyhutter
Copy link
Collaborator

Thanks for the info. Since we decided to go with the kvtree-approach in today's meeting, you can ignore my previous comments, and we should probably update #38 to say we're using a KVTree.

As far as per-AXL ID vs global values go, may I suggest we mix both globals and ID specific values in one tree, with the ID specific ones in a subtree, like:

// psudocode: set global debug and some ID specific values
kvtree config = {
	debug = 1
	kvtree {
		id = 42
		file_buf_size = 65536
	}
	kvtree {
		id = 43
		file_buf_size = 2097152
		compression = 1
	}
}

AXL_Config(config)

I think there's even a way you could index the subtree by id if you wanted, but I can't remember off the top of my head.

And for reading config values back, its probably easiest to have a way to return all the config values (global and ID-specific), and not worry about querying individual ones. You could extend AXL_Config() to do that:

/*
 * Get/set AXL configuration values.
 *
 * config:	The new configuration.  Global variables are in top level of
 *              the tree, and per-ID values are subtrees.  If config=NULL,
 *              then return a kvtree with all the configuration values (globals
 *              and all per-ID trees).
 *
 * Return value:	If config != NULL, then return config on success.  If
 *                      config=NULL (you're querying the config) then return
 *                      a new kvtree on success.  Return NULL on any failures.
 */
kvtree * AXL_Config(kvtree *config)

For example:

kvtree *old_config;

old_config = AXL_Config(NULL);
if (!old_config)
	printf("error\n");
// read back our config values into old_config

In fact, you could have return not only the config values, but all kvtree values for a particular ID too (like AXL_KEY_STATUS, AXL_KEY_STATE, ... etc).

@rhaas80
Copy link
Contributor Author

rhaas80 commented Sep 16, 2020

Keeping track of things, @tonyhutter pointed out #82 (comment) with the choice of what to return for setting (the input config) and getting (a full tree of all settings).

src/axl.c Outdated Show resolved Hide resolved
test/test_config.c Outdated Show resolved Hide resolved
test/test_config.c Outdated Show resolved Hide resolved
src/axl.h Outdated Show resolved Hide resolved
src/axl.c Outdated
if (kvtree_util_get_bytecount(config,
AXL_KEY_CONFIG_FILE_BUF_SIZE, &ul) == KVTREE_SUCCESS)
{
axl_file_buf_size = (int) ul;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

axl_file_buf_size is a size_t (which most likely is an unsigned long on 64-bit), so you can remove the check against the int. Or if you're being super-safe, check ul against SIZE_MAX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I had left the fix in rhaas/keyvalue_get. Cherry picked into rhaas/keyvalue. I'd rather not make assumptions about what size_t is wrt to unsigned long. The smallest type to be guaranteed by C to be 64bit is long long int with long being allowed to be 32bit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would just make axl_file_buf_size an unsigned long and remove the check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reluctant to change types of existing variables given that I do now know here they are used. If you think this is safe, fine with me for sure (makes for easier code). Note that right now kvtree_util_get_bytecount and scr_abtoull which SCR (the original code) uses to convert byte count strings to integer variables are inconsistent. The former using unsigned long (which is only guaranteed to be 32bit and is indeed so even on 64bit windows machines) while the latter uses unsigned long long (which is the smallest type guaranteed to be able to hold a 64bit number). So unless those two are changed there will still be a need of checks (though it could / should actually go into kvtree_util_get_bytecount making kvtree's user code much simpler).

@tonyhutter
Copy link
Collaborator

Please check that all indents are four spaces, and also add per-ID config options.

@tonyhutter
Copy link
Collaborator

Please also squash your commits

src/axl.c Outdated
@@ -309,6 +310,92 @@ int AXL_Finalize (void)
return rc;
}


/** Set a AXL config parameters */
kvtree* AXL_Config(const int id, const kvtree* config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would get rid of the id arg, and make the per-ID values subtrees within the main tree. For example:

    kvtree *tree;
    kvtree *id;

    tree = kvtree_new();

    /* Add some config for AXL ID 1 */
    id = kvtree_set_kv_int(tree, "id", 1);
    kvtree_set_kv_int(id, "compression", 1);
    kvtree_set_kv_int(id, "file_buf_size", 65536);

    /* Add some config for AXL ID 42 */
    id = kvtree_set_kv_int(tree, "id", 42);
    kvtree_set_kv_int(id, "file_buf_size", 1048576);

    printf("Whole tree ---\n");
    kvtree_print(tree, 4);

    /* Lookup an id 1's values */
    id = kvtree_get_kv_int(tree, "id", 1);
    printf("Just ID 1 ---\n");
    kvtree_print(id, 4);
Whole tree ---
      id
        42
          file_buf_size
            1048576
        1
          file_buf_size
            65536
          compression
            1
Just ID 1 ---
      file_buf_size
        65536
      compression
        1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, I have a not quite pull request (but a branch) that actually implements per transfer options. No thread locking yet. The diff can be seen here:

rhaas80/AXL@rhaas/keyvalue_get...rhaas80:rhaas/per_transfer

and the branch itself is

https://github.com/rhaas80/AXL/tree/rhaas/per_transfer

and the second last commit (rhaas80@b6a1700) is the one that goes from id being an argument to encoding it in the kvtree. I am not sure I like that solution, but it is certainly doable and the commit nicely shows the differences this implies to taking an int id argument in AXL_Config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in call on 2020-10-07:

  • go with kvtree only argument test
  • combine all 3 pull requests for set, get and per_transfer into single pull request
  • might keep commits separate

@rhaas80 rhaas80 marked this pull request as draft October 14, 2020 20:27
src/axl.h Show resolved Hide resolved
@tonyhutter
Copy link
Collaborator

Double check that your indents are all four spaces.

@rhaas80
Copy link
Contributor Author

rhaas80 commented Oct 19, 2020

Double check that your indents are all four spaces.

Will do.

I believe i asked this before: is there an actual document (or even better, a clang format file and an offical clang version to use) for the desired coding style? Right now SCR uses 2 spaces, AXL spaces 4 and there may well be subtle other differences.

@tonyhutter
Copy link
Collaborator

Yea, the inconsistency is annoying, and no, there's no style docs. We've talked a little about enforcing style in Travis:
#77 (comment) but haven't done it.

Copy link
Collaborator

@tonyhutter tonyhutter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@rhaas80 rhaas80 marked this pull request as ready for review October 22, 2020 01:58
@rhaas80
Copy link
Contributor Author

rhaas80 commented Oct 22, 2020

Based on the discussion in today's call I verified that with AXL at rhaas/keyvalue (ie this pull request) and all other VeloC components on master and SCR on develop, no more tests fail on my workstation than with AXL on master (some tests fail due to they assuming SLURM and / or multiple nodes).

doc/README.md Outdated Show resolved Hide resolved
doc/README.md Outdated Show resolved Hide resolved
@adammoody adammoody merged commit 9ccb313 into ECP-VeloC:master Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants