Dancer::Handler miscalculates content length for strings containing wide characters #980

Closed
a-adam opened this issue Dec 13, 2013 · 4 comments


a-adam commented Dec 13, 2013

the following code yields wrong octet counts if $content contains wide characters, e.g. "\x{100}":

sub render_response {
      [...]
        $response->header( 'Content-Length' => length($content) )
          if !defined $response->header('Content-Length');

... as length() operates on character semantics by default.
suggested fix:

{
    use bytes;
    $response->header( 'Content-Length' => length($content) )
       if !defined $response->header('Content-Length');
}
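
To make the difference concrete, here is a minimal standalone sketch in plain Perl (it does not touch Dancer's code). The sample strings and variable names are assumptions chosen for illustration. It compares the character count with the octet count after UTF-8 encoding and with what `length` reports under `use bytes`:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical response body containing one wide character (U+0100).
my $content = "abc\x{100}";

# Character count: what plain length() reports.
my $chars = length $content;

# Octet count after explicitly encoding the body as UTF-8.
my $octets = length encode('UTF-8', $content);

# Octet count of Perl's internal representation (the "use bytes" approach).
my $internal = do { use bytes; length $content };

# A Latin-1 range character that Perl stores internally as a single octet.
my $latin1          = "\xE9";
my $latin1_internal = do { use bytes; length $latin1 };
my $latin1_utf8     = length encode('UTF-8', $latin1);

print "characters: $chars, UTF-8 octets: $octets, internal octets: $internal\n";
print "latin1 internal octets: $latin1_internal, UTF-8 octets: $latin1_utf8\n";
```

Note that `use bytes` measures Perl's internal representation, which matches the UTF-8 octet count for the wide-character string but not for the Latin-1 one. Whether that matters here depends on how the body is encoded before it goes on the wire.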

a-adam commented Jan 20, 2014

hi Yanick, got any time to handle this? (it's a pretty darn serious bug, IMO.)
otherwise, i'll soon have to start working on my Amon2-migration branch... ;-)


yanick commented Jan 21, 2014

Hey @a-adam,

Sadly, my free time these days is measured in Planck time units. :-( I'll try to get to it as soon as I can. In the meantime, if you want to help and speed up the process, you can craft a patch that:

  1. implements the fix.
  2. adds tests that exhibit the problem, and gloriously pass when (1) is applied.
  3. alters the documentation, if necessary (here, it's probably not).

As a bonus, in the same stroke you'd carve yourself a place forever in the Dancer history as one of its Unicode heroes. :-)


a-adam commented Jan 22, 2014

same here, no time...
#993
although i do think the content length for the wire should never ever be miscalculated, the thing raises a few questions.
it looks like Dancer encodes unicode ("wide") characters to UTF-8 by default (though i am by no means sure of that ;-)
-- which leads to the question: is it possible to return binary data, without them being mangled?
furthermore, the whole thing leaves me with an eerie sense that you should probably make a server "charset" setting mandatory, or have a hard-coded default; the submitted tests trigger some "wide character in print" warnings in HTTP::Server::Simple::PSGI -- in my setup, anyway.
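
On the binary-data question, a small plain-Perl sketch may help show why a blanket UTF-8 encoding step would be a problem. It makes no claim about what Dancer actually does with response bodies; the payload and variable names are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical binary payload: three raw octets, two of them above 0x7F.
my $binary = pack 'C*', 0x89, 0x50, 0xFF;

# Treating those octets as characters and encoding them as UTF-8
# turns every octet above 0x7F into a two-octet sequence.
my $mangled = encode('UTF-8', $binary);

printf "original: %d octets (%s)\n", length $binary,  unpack 'H*', $binary;
printf "encoded:  %d octets (%s)\n", length $mangled, unpack 'H*', $mangled;
```

So binary bodies would only survive intact if they skip the encoding step, which is presumably where an explicit charset setting would have to draw the line between character strings and octet strings.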


yanick commented Mar 20, 2014

Merged the fix, thanks!

And yes, Dancer has a tangled utf8 management. That's something we should sit down and try to fix, one of these days.
