Draft: probe content type of files using different strategy (i.e. inspect first bytes) #18949

Open
HoneyryderChuck wants to merge 1 commit into master from reliable-media-detection

@HoneyryderChuck (Contributor) commented Jun 17, 2024

Current implementations rely on the file name/extension, which is prone to errors (.mp4: is it a video or a picture?) and to spoofing. This change allows users to use, or opt into, mechanisms that inspect the first bytes of a file's payload.

I'd like to make the same change for the other HTTP clients we use, but I'd like feedback to validate the viability of this approach first.

PR checklist

  • Read the contribution guidelines.
  • Pull Request title clearly describes the work in the pull request and Pull Request description provides details about how to validate the work. Missing information here may result in delayed response from the community.
  • Run the following to build the project and update samples:
    ./mvnw clean package 
    ./bin/generate-samples.sh ./bin/configs/*.yaml
    ./bin/utils/export_docs_generators.sh
    
    (For Windows users, please run the script in Git BASH)
    Commit all changed files.
    This is important, as CI jobs will verify all generator outputs of your HEAD commit as it would merge with master.
    These must match the expectations made by your contribution.
    You may regenerate an individual generator by passing the relevant config(s) as an argument to the script, for example ./bin/generate-samples.sh bin/configs/java*.
    IMPORTANT: Do NOT purge/delete any folders/files (e.g. tests) when regenerating the samples as manually written tests may be removed.
  • File the PR against the correct branch: master (upcoming 7.6.0 minor release - breaking changes with fallbacks), 8.0.x (breaking changes without fallbacks)
  • If your PR is targeting a particular programming language, @mention the technical committee members, so they are more likely to review the pull request.

@HoneyryderChuck HoneyryderChuck changed the title probe content type of files using different strategy (i.e. inspect first bytes) Draft: probe content type of files using different strategy (i.e. inspect first bytes) Jun 17, 2024
@HoneyryderChuck (Contributor, Author)

cc @martin-mfg @lwlee2608

@@ -1643,21 +1643,25 @@ public class ApiClient {
* @return The guessed Content-Type
*/
public String guessContentTypeFromFile(File file) {
String contentType = URLConnection.guessContentTypeFromName(file.getName());
Member

Thanks for the PR.

Can you please add some tests in https://github.com/OpenAPITools/openapi-generator/blob/5a18e9897bd8b2b4a5013a41bccaf93710dd3883/samples/client/petstore/java/okhttp-gson/src/test/java/org/openapitools/client/ApiClientTest.java that exercise guessContentTypeFromFile with different file types, to confirm the enhancement works?

Contributor Author

I've added them, but it seems that the default FileTypeDetector implementation relies on the file extension (see the commented-out test).

I'm not very familiar with Java, or with the documented way of swapping in a different detector implementation, so I'd need help from someone who is to make the commented-out test pass. Still, I'd say that as long as we use this API, users can at least plug in their own detector.
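
For illustration, here is a minimal sketch of what a content-based detector could look like. It assumes the standard java.nio.file.spi.FileTypeDetector extension point; the class name MagicBytesFileTypeDetector and the two-entry signature table are hypothetical and not part of this PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;

/** Hypothetical detector: the class name and the signature table are illustrative only. */
public class MagicBytesFileTypeDetector extends FileTypeDetector {

    // Leading bytes ("magic numbers") of two well-known formats.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};

    @Override
    public String probeContentType(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            return null; // null means "not recognized"
        }
        byte[] head = new byte[8];
        int read = 0;
        try (InputStream in = Files.newInputStream(path)) {
            // Read up to head.length bytes; a single read() may return fewer.
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
        }
        if (startsWith(head, read, PNG)) {
            return "image/png";
        }
        if (startsWith(head, read, PDF)) {
            return "application/pdf";
        }
        return null; // unknown signature
    }

    private static boolean startsWith(byte[] head, int length, byte[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (head[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

As far as I understand the JDK documentation, Files.probeContentType consults installed detectors before the system default, and a detector is installed by listing its fully qualified class name in a META-INF/services/java.nio.file.spi.FileTypeDetector file on the class path. A real implementation would need a much larger signature table.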

@HoneyryderChuck HoneyryderChuck force-pushed the reliable-media-detection branch from 5a18e98 to 5820375 Compare June 19, 2024 16:29
@wing328 (Member) commented Jun 30, 2024

String contentType = URLConnection.guessContentTypeFromName(file.getName());
if (contentType == null) {
try {
String contentType = Files.probeContentType(file.toPath());
Member

Shall we use URLConnection.guessContentTypeFromStream instead, as suggested in https://stackoverflow.com/a/19712111/677735?
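
A rough sketch of how that could look, for discussion only. The helper class ContentTypeSniffer is hypothetical (in the generated client this logic would live in ApiClient.guessContentTypeFromFile), and falling back to the file name when the stream is not recognized is my assumption, not something this PR decides.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.file.Files;

/** Hypothetical helper; not the generated ApiClient code. */
final class ContentTypeSniffer {

    private ContentTypeSniffer() {
    }

    static String guessContentTypeFromFile(File file) {
        String contentType = null;
        // guessContentTypeFromStream needs mark/reset support, hence the BufferedInputStream.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file.toPath()))) {
            contentType = URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            // Unreadable file: fall through to the name-based guess below.
        }
        if (contentType == null) {
            // Assumed fallback: keep the existing extension-based behaviour.
            contentType = URLConnection.guessContentTypeFromName(file.getName());
        }
        return contentType != null ? contentType : "application/octet-stream";
    }
}
```

Note that guessContentTypeFromStream only recognizes a limited set of signatures, so formats it does not know would still be classified by extension in this sketch.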
