
Commit 546a801

randyzwitch authored and fredrikekre committed
Documentation: utf8proc.jl tests (#24032)
* utf8proc.jl tests
  - normalize_string a bit weak, looked for a test suite quickly but didn’t find
  - graphemes seems to return iterators of letters, not sure how to test
* add spaces, move a misplaced triple backtick
* Move examples to correct method

  Move example to upper method, then make additional examples for isvalid method where type specified as first argument
1 parent c61aa27 commit 546a801

1 file changed: 81 additions, 0 deletions

base/strings/utf8proc.jl

Lines changed: 81 additions & 0 deletions
@@ -19,6 +19,15 @@ export normalize_string, graphemes, is_assigned_char, textwidth, isvalid,

 Returns `true` if the given value is valid for its type, which currently can be either
 `Char` or `String`.
+
+# Examples
+```jldoctest
+julia> isvalid(Char(0xd800))
+false
+
+julia> isvalid(Char(0xd799))
+true
+```
 """
 isvalid(value)

@@ -28,6 +37,15 @@ isvalid(value)
 Returns `true` if the given value is valid for that type. Types currently can
 be either `Char` or `String`. Values for `Char` can be of type `Char` or [`UInt32`](@ref).
 Values for `String` can be of that type, or `Vector{UInt8}`.
+
+# Examples
+```jldoctest
+julia> isvalid(Char, 0xd800)
+false
+
+julia> isvalid(Char, 0xd799)
+true
+```
 """
 isvalid(T,value)

@@ -195,6 +213,18 @@ options (which all default to `false` except for `compose`) are specified:
 * `stable=true`: enforce Unicode Versioning Stability

 For example, NFKC corresponds to the options `compose=true, compat=true, stable=true`.
+
+# Examples
+```jldoctest
+julia> "μ" == normalize_string("µ", compat=true) #LHS: Unicode U+03bc, RHS: Unicode U+00b5
+true
+
+julia> normalize_string("JuLiA", casefold=true)
+"julia"
+
+julia> normalize_string("JúLiA", stripmark=true)
+"JuLiA"
+```
 """
 function normalize_string(s::AbstractString, nf::Symbol)
     utf8proc_map(s, nf == :NFC ? (UTF8PROC_STABLE | UTF8PROC_COMPOSE) :
@@ -255,6 +285,15 @@ category_string(c) = category_strings[category_code(c)+1]
     is_assigned_char(c) -> Bool

 Returns `true` if the given char or integer is an assigned Unicode code point.
+
+# Examples
+```jldoctest
+julia> is_assigned_char(101)
+true
+
+julia> is_assigned_char('\x01')
+true
+```
 """
 is_assigned_char(c) = category_code(c) != UTF8PROC_CATEGORY_CN

@@ -400,6 +439,15 @@ end

 Tests whether a character is a control character.
 Control characters are the non-printing characters of the Latin-1 subset of Unicode.
+
+# Examples
+```jldoctest
+julia> iscntrl('\x01')
+true
+
+julia> iscntrl('a')
+false
+```
 """
 iscntrl(c::Char) = (c <= Char(0x1f) || Char(0x7f) <= c <= Char(0x9f))

@@ -431,13 +479,37 @@ ispunct(c::Char) = (UTF8PROC_CATEGORY_PC <= category_code(c) <= UTF8PROC_CATEGOR
 Tests whether a character is any whitespace character. Includes ASCII characters '\\t',
 '\\n', '\\v', '\\f', '\\r', and ' ', Latin-1 character U+0085, and characters in Unicode
 category Zs.
+
+# Examples
+```jldoctest
+julia> isspace('\n')
+true
+
+julia> isspace('\r')
+true
+
+julia> isspace(' ')
+true
+
+julia> isspace('\x20')
+true
+```
 """
 @inline isspace(c::Char) = c == ' ' || '\t' <= c <='\r' || c == '\u85' || '\ua0' <= c && category_code(c) == UTF8PROC_CATEGORY_ZS

 """
     isprint(c::Char) -> Bool

 Tests whether a character is printable, including spaces, but not a control character.
+
+# Examples
+```jldoctest
+julia> isprint('\x01')
+false
+
+julia> isprint('A')
+true
+```
 """
 isprint(c::Char) = (UTF8PROC_CATEGORY_LU <= category_code(c) <= UTF8PROC_CATEGORY_ZS)

@@ -449,6 +521,15 @@ isprint(c::Char) = (UTF8PROC_CATEGORY_LU <= category_code(c) <= UTF8PROC_CATEGOR
 Tests whether a character is printable, and not a space.
 Any character that would cause a printer to use ink should be
 classified with `isgraph(c)==true`.
+
+# Examples
+```jldoctest
+julia> isgraph('\x01')
+false
+
+julia> isgraph('A')
+true
+```
 """
 isgraph(c::Char) = (UTF8PROC_CATEGORY_LU <= category_code(c) <= UTF8PROC_CATEGORY_SO)
