-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvlist: add _s APIs to pass str length as arg #36
Conversation
Signed-off-by: Takahiro Yamashita <[email protected]>
Signed-off-by: Takahiro Yamashita <[email protected]>
Could you please explain what's the rationale behind this? Do we need to use non null terminated binary chunks of data anywhere? I can see how a decoded msgpack string would not be null terminated but I want to evaluate the pros / cons of the change. |
Pros
1. Memory safetyCurrent APIs can cause out of bounds memory access. New APIs prevents it since we can pass a length of string. 2. Handling non-null-terminated stringWe can set non-null-terminated string as a key of kvlist since we can pass a length of the string. As you know, msgpack-c string is not null-terminated. 3. Reducing strlen costCurrent API creates cfl_sds_t using ConsIt is redundant if we pass const string. In such case, it is better to use current API. |
The real question is how much sense it makes to pass that complexity cost along to the client code and that depends on what the client code is which is what I want to have a clear picture on because in most cases I have observed that the pattern was misunderstood and blindly followed which caused the client code to be unnecessarily bulkier. Is there any clarity on how much code is performing memory copy operations for this? I don't have anything against having this as an option and we might need it in the future but I don't want to mindlessly add it and cause confusion. Even though it's true that passing a non null terminated buffer to a function that expects one could cause an information leak or even an invalid memory access if it's close enough to the end of a mapping that has no adjacent mappings that is not the real answer. The real answer is to be mindful and understand the constraints and data types the functions you are working with expect. |
https://wiki.sei.cmu.edu/confluence/display/c/STR07-C.+Use+the+bounds-checking+interfaces+for+string+manipulation I'm concerned about the proliferation of the use of something that shouldn't be used. Memory copy operations of inserting
Current inserting API will increase 16 or 48 bytes memory copy operations.
1. cfl_kvlist_insert_int64Current insert functions will be changed like this int cfl_kvlist_insert_int64(struct cfl_kvlist *list,
char *key, int64_t value)
{
return cfl_kvlist_insert_int64_s(list, key, strlen(key), value);
} Current API calls new API with 2. cfl_kvlist_insert_int64_sThe difference of new APIs is calling 3. cfl_kvlist_insert_s
Memory copy operations of fetchingCurrent fetching API will increase 8 or 32 bytes memory copy operations.
1. cfl_kvlist_fetchCurrent API will be changed like this struct cfl_variant *cfl_kvlist_fetch(struct cfl_kvlist *list, char *key)
{
return cfl_kvlist_fetch_s(list, key, strlen(key));
} Current API calls new API with 2. cfl_kvlist_fetch_sThe new API calls My patch removes |
I'm not worried about the additional shadow stack space used, what I want to know is if there is existing client code that has to manually copy and terminate non null terminated strings because these functions do not exist. What I don't want is to confuse people into thinking that they need to use the |
So why does cfl_kv support struct cfl_kv *cfl_kv_item_create_len(struct cfl_list *list,
char *k_buf, size_t k_len,
char *v_buf, size_t v_len) It is already using at flb_signv4. cfl_sds also supports How about |
That seems to be because it's using That's not connected to the cfl variant or any associated types. In the cfl variant family strings are null terminated (because we use UTF-8) and binary data storage requires the length to be specified. It's not that I have something against this in particular but I would like have a very clear picture of why do we need or want this. |
I want to refactor filter_record_modifier by using new APIs. Currently filter_record_modifier defines own kv store. It is used to store properties of filter_record_modifier. It also is used to check record using msgpack string. I believe these API is useful for such cases and improving maintainability by using cfl APIs. |
For your use case you are better off using |
Thank you for comment. |
The reason why I'd stay away from variants in most cases is they are lacking in ergonomics and I didn't really "design" them for general use at the time, it was a reaction to a particular need (otlp attributes). Anyway, as you mentioned they were adopted for other uses so we need to improve their ergonomics, the quality (I particularly dislike the kvlist implementation) and maybe extend them (do we want a cfl_variant aware / native hash map or list?) Maybe those are some questions we could us to start a wider discussion in order to turn variants into first class citizens but that doesn't have anything to do with this PR. I have already approved it so we don't get stuck in a pointless argument. |
Signed-off-by: Eduardo Silva <[email protected]>
Signed-off-by: Eduardo Silva <[email protected]>
Getting merged this to get the functionality, I will check in another PR potential fixes for CI tests |
This patch is to add cfl_kvlist_insert_*_s APIs and tests.
These API is to pass a length of key / (value for insert_string).
It is useful to pass non-NULL-terminated string like
msgpack_object_str
This patch also adds
cfl_variant_create_from_string_s
.It is needed by
cfl_kvlist_insert_string_s
to create value.Valgrind output