optimized textwidth(::Char) for ASCII #55398

stevengj · 2024-08-06T15:39:15Z

For the common case of ASCII characters, I found in https://github.com/JuliaLang/julia/pull/55351/files#r1705746669 that textwidth can be sped up by ~~60%~~ 2.7x or more by adding a fast-path ASCII ~~lookup table~~ hard-coded rule.

The slowdown for non-ASCII characters seems negligible (1% or less). Here, textwidth2 is the new function, and textwidth is the old function (Julia 1.10.4, Apple M1 Pro):

julia> @btime textwidth('x'); @btime textwidth2('x')
  3.167 ns (0 allocations: 0 bytes)
  1.166 ns (0 allocations: 0 bytes)
1

julia> @btime textwidth('α'); @btime textwidth2('α')
  3.208 ns (0 allocations: 0 bytes)
  3.166 ns (0 allocations: 0 bytes)
1

jakobnissen · 2024-08-06T15:56:11Z

Can it be faster if the const lookup table is a Tuple? The reason is that const-ness at the moment only refers to the binding, so a const vector will do a triple load (load the underlying memoryref, then load the pointer, then load the offset) whereas an immutable Tuple might generate better code.
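To make the Vector-versus-Tuple point concrete, here is a minimal sketch of the two table layouts. All names (`ascii_width`, `WIDTHS_VEC`, `WIDTHS_TUP`, `width_vec`, `width_tup`) are hypothetical and are not taken from the PR; the fallback to `textwidth` for non-ASCII input is also an assumption made only to keep the example self-contained.

```julia
# Hypothetical names; a sketch, not the code from this PR.
# ASCII width rule: 0 for C0 controls (0x00-0x1f) and DEL (0x7f), 1 otherwise.
ascii_width(b::Integer) = (0x20 <= b < 0x7f) ? 1 : 0

# Mutable table: `const` fixes only the binding, so each lookup still has to
# go through the array object to reach its data.
const WIDTHS_VEC = UInt8[ascii_width(b) for b in 0x00:0x7f]

# Immutable table: the 128 entries are part of the constant value itself.
const WIDTHS_TUP = Tuple(WIDTHS_VEC)

# Both lookups index with the ASCII byte plus one (Julia is 1-based) and
# defer to Base's textwidth for anything outside ASCII.
width_vec(c::Char) = isascii(c) ? Int(WIDTHS_VEC[UInt8(c) + 1]) : textwidth(c)
width_tup(c::Char) = isascii(c) ? Int(WIDTHS_TUP[UInt8(c) + 1]) : textwidth(c)
```

The two functions return identical results; whether the tuple version is actually faster depends on the Julia version and hardware, so it is worth timing both on your own machine.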

stevengj · 2024-08-06T15:59:55Z

@jakobnissen, good catch, a tuple table indeed seems to be faster:

julia> @btime textwidth('x'); @btime textwidth2('x'); @btime textwidth3('x');
  3.166 ns (0 allocations: 0 bytes)
  1.958 ns (0 allocations: 0 bytes)
  1.125 ns (0 allocations: 0 bytes)

(textwidth is the old version, textwidth2 is the array version, and textwidth3 is the tuple version)

stevengj · 2024-08-06T16:03:23Z

Another possibility would be to omit the table entirely and replace it with a couple of if statements, since the table is quite simple right now:

(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)

Update: it looks like

b < 0x7f && return Int(b >= 0x20);

is about the same speed as the tuple lookup table, maybe 1% faster, and is simpler, so I'll switch to this. (It omits a fast path for the ASCII U+007F "DEL" character, which has width 0, but having that odd character be slightly slower doesn't seem like a big deal to me.)
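For readers who want to check that the one-line rule matches the table, here is a minimal sketch. The function name `textwidth_sketch`, the constant `TABLE`, the `isascii` guard, and the fallback to Base's `textwidth` are all assumptions for illustration; only the `b < 0x7f && return Int(b >= 0x20)` line comes from the comment above.

```julia
# Hypothetical wrapper; not the actual code in base/strings/unicode.jl.
function textwidth_sketch(c::Char)
    if isascii(c)
        b = UInt8(c)
        # Rule quoted above: printable ASCII (0x20-0x7e) has width 1,
        # C0 controls (0x00-0x1f) have width 0.
        b < 0x7f && return Int(b >= 0x20)
    end
    # DEL (0x7f) and all non-ASCII characters take the general path.
    return textwidth(c)
end

# The 128-entry table from the comment: 32 zeros, 95 ones, then one zero.
const TABLE = (ntuple(_ -> 0, 32)..., ntuple(_ -> 1, 95)..., 0)

# The branch agrees with the table on every code point it handles itself.
agrees = all(textwidth_sketch(Char(b)) == TABLE[b + 1] for b in 0:126)
```

DEL is left out of the comparison because the sketch hands it to the general path rather than deciding it with the rule.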

stevengj · 2024-08-06T19:57:00Z

Looks like an unrelated failure in the REPL tests?

optimized textwidth(::Char) for ASCII

5d64115

stevengj added the performance (Must go faster) and unicode (Related to unicode characters and encodings) labels Aug 6, 2024

stevengj mentioned this pull request Aug 6, 2024

add rtruncate, ltruncate, ctruncate for truncating strings #55351

Merged

stevengj added 2 commits August 6, 2024 11:47

fix AbstractChar vs Char

16b269f

more textwidth tests

68d372c

tuple table is faster

516bfd4

stevengj added 2 commits August 6, 2024 12:11

replace lookup table with check

c67ebbe

more tests

dcad1b0

JeffBezanson reviewed Aug 6, 2024

JeffBezanson reviewed Aug 6, 2024

JeffBezanson approved these changes Aug 6, 2024

stevengj and others added 3 commits August 6, 2024 21:21

Update base/strings/unicode.jl

a5cdf25

Co-authored-by: Jeff Bezanson

Update base/strings/unicode.jl

024e83d

Co-authored-by: Jeff Bezanson

Merge branch 'master' into ascii_textwidth

8af04ee

stevengj added the merge me (PR is reviewed. Merge when all tests are passing) label Aug 7, 2024

IanButterworth merged commit b43e247 into master Aug 7, 2024
8 checks passed

IanButterworth deleted the ascii_textwidth branch August 7, 2024 11:48

oscardssmith removed the merge me (PR is reviewed. Merge when all tests are passing) label Aug 7, 2024

lazarusA pushed a commit to lazarusA/julia that referenced this pull request Aug 17, 2024

optimized textwidth(::Char) for ASCII (JuliaLang#55398)

1fbf822
