Limit codepoint caching to max 3-byte sequences
This keeps the benefit of avoiding sprintf on common codepoints without
ballooning the lookup table too much.
avit committed Nov 14, 2020
1 parent 0eb472e commit 7781530
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion lib/rack/utf8_sanitizer.rb
@@ -250,9 +250,11 @@ def unescape_unreserved(input)
     # optimized from URI::RFC2396_Parser#escape
     def escape_unreserved(input)
       @unsafe_map ||= Hash.new do |table, us|
-        table[us] = us.each_byte.reduce('') do |tmp, uc|
+        encoded = us.each_byte.reduce('') do |tmp, uc|
           tmp << sprintf('%%%02X', uc)
         end
+        table[us] = encoded if us.bytesize <= 3
+        encoded
       end
       input.gsub(UNSAFE, @unsafe_map).force_encoding(Encoding::US_ASCII)
     end
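
The effect of the size check can be sketched outside the gem. The snippet below is a minimal stand-alone illustration, not the library's code: `MAX_CACHED_BYTES`, `build_escape_map`, and the `/[^a-z]/` pattern are hypothetical names chosen for the example, and the pattern merely stands in for the gem's `UNSAFE` regex. It shows that a 3-byte character is memoized in the hash while a 4-byte character is encoded on each lookup without being stored.

```ruby
# Hypothetical stand-alone sketch; names below are assumptions, not the gem's API.
MAX_CACHED_BYTES = 3 # mirrors the `<= 3` check in the diff

def build_escape_map
  Hash.new do |table, us|
    # Percent-encode every byte of the matched string.
    encoded = us.each_byte.reduce(+'') do |tmp, uc|
      tmp << format('%%%02X', uc)
    end
    # Memoize only short sequences so the table stays small.
    table[us] = encoded if us.bytesize <= MAX_CACHED_BYTES
    encoded
  end
end

escape_map = build_escape_map
three_byte = escape_map["\u20AC"]    # EURO SIGN, 3 bytes in UTF-8
four_byte  = escape_map["\u{1F600}"] # emoji, 4 bytes in UTF-8

# String#gsub accepts a hash and looks up each match, which triggers
# the default block for keys that are not yet cached.
escaped = "a\u20ACb".gsub(/[^a-z]/, escape_map)
```

After these calls only the 3-byte key is present in `escape_map`. The 4-byte result is still returned correctly because the block's last expression is `encoded`, not the assignment.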