-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue where query parameters for S3 are double encoded #704
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone have a good location to find example encoding tests from AWS itself? Or known good tests in another language? Every time there is an encoding issue I get a PR addressing that specific issue in some partial way and over time param encoding has gotten really messy in ExAws. I'm wondering if there is some more canonical, comprehensive thing we could be using.
@@ -106,6 +105,9 @@ defmodule ExAws.Request.Url do | |||
|
|||
def uri_encode(url), do: URI.encode(url, &valid_path_char?/1) | |||
|
|||
defp query_replace_spaces(nil), do: nil | |||
defp query_replace_spaces(query), do: String.replace(query, "+", "%20") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's %2B
in one place and %20
in another. This seems wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, that whole String.replace(new_path, "+", "%2B")
isn't needed and has no effect. The new_path
gets uri_encoded a little higher up.
In the S3 operation code path, the url
that gets passed to sanize has already had the query params encoded, but the reset of the URL hasn't been. So if new_path
in this case contains a +, then it is actually a + and not a space.
s3.ex:29 calls ExAws.Request.Url.build(config)
which is what encodes the query and not the path.
It feels a little bit inconsistent to encode one earlier and the other later. Maybe it would be better to shift the encoding to one place in the code. And perhaps it's best to encode both the path and the query in Url.sanitize() because ExAws.Operation.{JSON,Query,RestQuery} also call ExAws.Request.Url.build and therefore have their query and not path encoded if they include any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a quick change taking out that replace. Let me know if you think it'd be a good idea for me to try to move the uri_encoding out of ExAws.Request.Url.build(config)
and to encode both the path and query in ExAws.Request.Url.sanitize instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
query_replace_spaces
appears to actually be replacing +'s rather than spaces .. and I think that could more simply use URI.encode_www_form/1
instead of URI.encode
.
On a related note, the query string parse is redundant with the URI.parse/1
call which also separates out the query string.
It would be nice to see this fix get into the repo, but perhaps with a bit more cleanup?
Yeah good question about the tests. I have a much simpler implementation of the encoding now that encodes both in the build() method. It works for the tests currently in url_test.ex but I'm a little nervous about pushing it as it's not a comprehensive set of tests. I'll see what I can find for more test coverage, but yeah if anyone can point out a good suite to copy that would help. |
Sorry I haven't forgotten about trying to get better test coverage in here. I've just ended up with crunch time at work. I'll hopefully get a chance to add some more over the weekend. I have had a look through a few of the amazon provided libraries in python and ruby to see if there's an obvious set of tests to copy from there and haven't found anything comprehensive yet. I'll keep looking, but if anyone else finds a set of tests that are worth reviewing and perhaps copying and can point me in the right direction I'm happy to add them. I am using this patch in production and it's fixed the original issue I had and the test cases in this PR at least cover that case. i.e. "When using ExAws.S3.list_objects_v2 or ExAws.S3.list_objects and a key that has a space or unicode chars in it, the query params passed to S3 are double encoded." The encoding in the code here can definitely be simplified. |
Heyy, thanks for creating the lib! It's super helpful :) there's any updates on this issue? I'm also facing this problem and I'm currently using this PR as a fix. |
@jotaviobiondo No, no updates. If you'd like to have a stab at adding some broader encoding tests and tidying up the PR, I'd be more than happy to accept an updated version. |
When using
ExAws.S3.list_objects_v2
orExAws.S3.list_objects
and a key that has a space or unicode chars in it, the query params passed to S3 are double encoded.The following command fails because the URL signatures don't match. This is because the query prefix is encoded to replace " " with +. Then the sanitize method replacces + with %2B.
This PR changes the sanitize method so that it only sanitised the path and not the query params again and the output should be as follows. There are tests as part of the PR that show the new behaviour.