2.3 字符串

name := 'Bob'
assert name.len == 3       // will print 3
assert name[0] == u8(66) // indexing gives a byte, u8(66) == `B`
assert name[1..3] == 'ob'  // slicing gives a string 'ob'

// escape codes
windows_newline := '\r\n'      // escape special characters like in C
assert windows_newline.len == 2

// arbitrary bytes can be directly specified using `\x##` notation where `#` is
// a hex digit aardvark_str := '\x61ardvark' assert aardvark_str == 'aardvark'
assert '\xc0'[0] == u8(0xc0)

// or using octal escape `\###` notation where `#` is an octal digit
aardvark_str2 := '\141ardvark'
assert aardvark_str2 == 'aardvark'

// Unicode can be specified directly as `\u####` where # is a hex digit
// and will be converted internally to its UTF-8 representation
star_str := '\u2605' // ★
assert star_str == '★'
assert star_str == '\xe2\x98\x85' // UTF-8 can be specified this way too.

2.3.1 字符串定义与用法

V 中，字符串是一个只读的字节序列。所有的 Unicode 字符用 UTF-8 编码。

s := 'hello 🌎' // emoji takes 4 bytes
assert s.len == 10

arr := s.bytes() // convert `string` to `[]u8`
assert arr.len == 10

s2 := arr.bytestr() // convert `[]byte` to `string`
assert s2 == s

字符串变量不可改变。你不能改变元素：

mut s := 'hello 🌎'
s[0] = `H` // not allowed

error: cannot assign to s[i] since V strings are immutable

注意，索引一个字符串会得到一个 byte，既不是 rune 也不是另一个 string。索引与字符串中的字节一一对应，而不是 Unicode 码。如果你想将 byte 转换成 string，在 byte 类型上使用 .ascii_str() 方法：

country := 'Netherlands'
println(country[0]) // Output: 78
println(country[0].ascii_str()) // Output: N

单双引号都可以用来表示字符串。为了代码一致性，vfmt 会将双引号转换成单引号，除非字符串已经包含了单引号字符。

字符串前加上 r 可以得到原始字符串。转义处理（Escape handling）在原始字符串中不会进行：

s := r'hello\nworld' // the `\n` will be preserved as two characters
println(s) // "hello\nworld"

字符串可以很容易地转换成整型：

s := '42'
n := s.int() // 42

// all int literals are supported
assert '0xc3'.int() == 195
assert '0o10'.int() == 8
assert '0b1111_0000_1010'.int() == 3850
assert '-0b1111_0000_1010'.int() == -3850

其他更多高级的 字符串 处理与转换，参考 vlib/strconv 模块。

2.3.2 字符串内插（String interpolation）

基本的内插句法十分简单——在变量前使用 $。变量将会转成字符串并嵌入在句子中：

name := 'Bob'
println('Hello, $name!') // Hello, Bob!

对于域中的变量也可以使用：'age = $user.age'。如果你需要更复杂的表达，使用 ${}：'can register = ${user.age > 13}'。

2.3.3 字符串格式化输出

V 支持与 C 中 printf 相似的格式说明符。f、g、x、o、b 等都可以作为选项来说明输出格式。编译器很注重存储大小，所以没有 hd 或者 llu。

要使用格式说明符，遵循以下格式：

${varname:[flags][width][.precision][type]}

flags: 可以为 0 或以下选项
- - 输出左对齐。
- + 输出右对齐。无须使用，该选项为默认。
- 0 用‘0’作为填充字符而不是默认的‘空格’。
- 注意：V 现在不支持用 ' 或者 # 作为格式标志。
width: 整数，描述打印输出的最小宽度。
precision: 输入变量是浮点时，跟在 . 后面的整数描述精确到小数点后的位数。变量为整型时忽略。
type:
- f、F 浮点数
- e、E 指数形式的浮点数
- g、G 浮点数，浮点表示法表示小值，指数表示法表示大值
- d 10 进制整数
- x 16 进制整数
- o 8 进制整数
- b 2 进制整数
- s 字符串，几乎从不使用

注意：当数字类型需要呈现字母时（如十六进制字符串或无限大等特殊值），小写版本的类型强制使用小写字母，大写版本强制使用大写字母。

注意：绝大多数情况下，最好让类型那一项为空。浮点数会自动以 g 渲染，整数自动以 d 渲染，s 永远是冗余的。只有以下三种情况推荐清晰地标明类型：

格式化字符串在编译时解析，因此指定类型可以帮助检测错误。
格式化字符串默认使用小写字母表示十六进制数字，且在指数中使用小写字母。可以使用大写类型标志强制在十六进制数字与指数中使用大写字母。
格式化字符串是获知整数的十六进制、二进制或八进制字符串的最方便方法。

进一步了解请阅读 Format Placeholder Specification。

x := 123.4567
println('[${x:.2}]') // round to two decimal places => [123.46]
println('[${x:10}]') // right-align with spaces on the left => [   123.457]
println('[${int(x):-10}]') // left-align with spaces on the right => [123       ]
println('[${int(x):010}]') // pad with zeros on the left => [0000000123]
println('[${int(x):b}]') // output as binary => [1111011]
println('[${int(x):o}]') // output as octal => [173]
println('[${int(x):X}]') // output as uppercase hex => [7B]

println('[${10.0000:.2}]') // remove insignificant 0s at the end => [10]
println('[${10.0000:.2f}]') // do show the 0s at the end, even though they do not change the number => [10.00]

2.3.4 字符串操作

name := 'Bob'
bobby := name + 'by' // + is used to concatenate strings
println(bobby) // "Bobby"
mut s := 'hello '
s += 'world' // `+=` is used to append to a string
println(s) // "hello world"

V 中的所有操作都必须保证两边类型一致。你不能将整型拼接到字符串上：

age := 10
println('age = ' + age) // not allowed

error: infix expr: cannot use int (right expression) as string

此时必须先将 age 转换成 string：

age := 11
println('age = ' + age.str())

或者使用字符串内插（推荐该用法）：

age := 12
println('age = $age')

2.3.5 Unicode 字符（Runes）

一个 Rune 表示一个 Unicode 字符，是 u32 的别称。用 `（反引号）来指示：

rocket := `🚀`

Rune 可以通过 .str() 方法转换成 UTF-8 编码的字符串。

rocket := `🚀`
assert rocket.str() == '🚀'

Rune 可以通过 .bytes() 方法转换成 UTF-8 编码的字节。

rocket := `🚀`
assert rocket.bytes() == [u8(0xf0), 0x9f, 0x9a, 0x80]

十六进制、十进制与 Unicode 都可以用来表示 Rune：

assert `\x61` == `a`
assert `\141` == `a`
assert `\u0061` == `a`

// multibyte literals work too
assert `\u2605` == `★`
assert `\u2605`.bytes() == [u8(0xe2), 0x98, 0x85]
assert `\xe2\x98\x85`.bytes() == [u8(0xe2), 0x98, 0x85]
assert `\342\230\205`.bytes() == [u8(0xe2), 0x98, 0x85]

注意：Unicode 字符与字符串使用相同的转义语法，但它们只能包含一个 Unicode 字符。因此，如果代码未指定单个 Unicode 字符，则会在编译时报错。

再次提醒，字符串在被索引时，内容为 bytes 而非 runes:

rocket_string := '🚀'
assert rocket_string[0] != `🚀`
assert 'aloha!'[0] == `a`

字符串可以通过 .runes() 方法转换成 Unicode 字符。

hello := 'Hello World 👋'
hello_runes := hello.runes() // [`H`, `e`, `l`, `l`, `o`, ` `, `W`, `o`, `r`, `l`, `d`, ` `, `👋`]
assert hello_runes.string() == hello

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#005_字符串.md

#005_字符串.md

2.3 字符串

2.3.1 字符串定义与用法

2.3.2 字符串内插（String interpolation）

2.3.3 字符串格式化输出

2.3.4 字符串操作

2.3.5 Unicode 字符（Runes）

Files

#005_字符串.md

Latest commit

History

#005_字符串.md

File metadata and controls

2.3 字符串

2.3.1 字符串定义与用法

2.3.2 字符串内插（String interpolation）

2.3.3 字符串格式化输出

2.3.4 字符串操作

2.3.5 Unicode 字符（Runes）