-
-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libcurl] support uploading large files #19565
base: master
Are you sure you want to change the base?
Conversation
@michelealbano Please help review this PR related to libcurl. |
https://github.com/OpenAPITools/openapi-generator/actions/runs/10803849417/job/29968627388?pr=19565 please update the samples so that the CI can test the change |
@wing328 Samples updated for CI testing. Let me know if anything else is needed. |
curl_mime_data requires storing the file in memory use curl_mime_filename instead of curl_mime_data to support uploading large files.
cc |
Hi, Thank you for PR. We should not modify the binary_t data type for this purpose since the multipart data can have more information in itself. For this, we can define a separate struct for handling multipart data with parameters as given here: https://swagger.io/docs/specification/describing-request-body/multipart-requests/. Could you provide a spec example to understand how a type file is defined where a file path will be given instead of a binary blob? If there is some specific type/format which highlights if it is a binary format or file path, we can may be leverage that. |
Hi, @zhemant Thank you for the input. I believe we should deprecate the current binary_t type, as it copies the file content into memory, which can lead to memory issues when handling large files. Instead, I propose using filename and filepath fields directly to reference the file on disk, avoiding memory overload during file uploads. The data field in the current binary_t structure can be maintained temporarily for compatibility, but we should plan to remove it in a future version. Let me know your thoughts. |
Hi @myml , I would not modify/deprecate binary_t, since it is not used only for files but for other binary types that are not files.
In the above specification, binary_t is with binary string. This includes binary blob or other binary sending, in addition to sending files(what we do now). For simplicity, we could check the first char if it is "@" then we treat it as a path(how curl does it) and call the file handling functions from curl, otherwise we treat it as normal binary. Curl takes the file name from the path to set the respective file name. But that's the best we could do without adding a new data type for the file. Do you think this would work in your scenario? |
Hi @zhemant, Using the first character to determine the type is a bad idea because the first character of binary data could very well be "@" itself, which would lead to unexpected behavior. I currently haven't found a way in the specification to differentiate between binary data and binary files. If we were to add a new data type, I'm not sure how to automatically distinguish between the two when generating code. What are your thoughts on this? |
I think it’s necessary to add a filename field to binary_t, since right now we are hardcoding it as "image.png," which feels more like a temporary solution. |
Yes, I agree, this should not be hardcoded.
There is is possibility of having "@" in binary at 1st place. I have the following thoughts:
I would propose a object like: // name of the data this should be already defined as part of API in the form of a key in KVP
char *name;
// For value, following stuct could be used(this I am referencing from libcurl form data):
typedef struct mime_form_param_t {
// payload data binary or string
char *data;
// file name to be used
char *filename;
// content type to set for this data
char *content_type;
// encoding to use to send data
char *encoder;
// list of additional headers
list_t *headers;
// custom to know if the data is file path and not binary
bool is_filepath;
} mime_form_param_t; I think with above struct, we get the option to know if the data is file path or binary, and additionally we could provide the options which are part of form params. |
curl_mime_data requires storing the file in memory use curl_mime_filename instead of curl_mime_data to support uploading large files.
PR checklist
Commit all changed files.
This is important, as CI jobs will verify all generator outputs of your HEAD commit as it would merge with master.
These must match the expectations made by your contribution.
You may regenerate an individual generator by passing the relevant config(s) as an argument to the script, for example
./bin/generate-samples.sh bin/configs/java*
.IMPORTANT: Do NOT purge/delete any folders/files (e.g. tests) when regenerating the samples as manually written tests may be removed.
master
(upcoming 7.6.0 minor release - breaking changes with fallbacks),8.0.x
(breaking changes without fallbacks)