
Commit 367a2da

committed
small fixes
1 parent f3eac7e commit 367a2da

20 files changed

+797
-81
lines changed

aoc2023-first-days.html

+9-9
@@ -16,12 +16,12 @@ <h2><a href="/">naming is hard</a></h2>
1616
</div>
1717
<div class="article">
1818
<section id="Zero-allocation-hello-world-in-Rust">
19-
<h1 date="2023/12/02" hide="true">Zero allocation hello world in Rust</h1>
20-
<p>For this year&rsquo;s <a href="https://adventofcode.com">Advent of Code</a> I decided to try <a href="https://www.rust-lang.org/"><strong>Rust</strong></a>. I&rsquo;m a complete newbie and still learning the basic concepts of the language (btw, the <a href="https://rust-book.cs.brown.edu">Brown Book</a> is amazing), but there was one idea I wanted to try for the AoC challenges: implementing <strong>zero allocation</strong> solutions (at least for the first ones)!</p>
19+
<a class="heading-anchor" href="#Zero-allocation-hello-world-in-Rust"><h1 date="2023/12/02" hide="true">Zero allocation hello world in Rust</h1>
20+
</a><p>For this year&rsquo;s <a href="https://adventofcode.com">Advent of Code</a> I decided to try <a href="https://www.rust-lang.org/"><strong>Rust</strong></a>. I&rsquo;m a complete newbie and still learning the basic concepts of the language (btw, the <a href="https://rust-book.cs.brown.edu">Brown Book</a> is amazing), but there was one idea I wanted to try for the AoC challenges: implementing <strong>zero allocation</strong> solutions (at least for the first ones)!</p>
2121
<p>What does <strong>zero allocation</strong> mean? The Rust Book has a nice <a href="https://rust-book.cs.brown.edu/ch04-01-what-is-ownership.html">chapter about ownership</a> that describes concepts such as memory, the <em>stack</em> and the <em>heap</em>. In short, your program usually operates on memory from two regions:</p>
2222
<p>&hellip;something about zero allocations&hellip;</p>
2323
<p>How can we analyze the allocations of our program? I found a nice tool, <a href="https://github.com/matt-kimball/allocscope">allocscope</a>, which records all allocations made with <code>malloc</code> in your program and lets you analyze the source of those allocations. Alternatively, you can compile some simple C code into a shared library that overrides the default <code>malloc</code> and adds some debug output to it:</p>
24-
<pre><code><span class="macro">#define</span> _<span class="identifier">GNU_SOURCE</span> <span class="number">1</span></code>
24+
<pre class="language-c"><code><span class="macro">#define</span> _<span class="identifier">GNU_SOURCE</span> <span class="number">1</span></code>
2525
<code><span class="macro">#include</span> <span class="string">"stdlib.h"</span></code>
2626
<code><span class="macro">#include</span> <span class="string">"stdio.h"</span></code>
2727
<code><span class="macro">#include</span> <span class="string">"dlfcn.h"</span></code>
@@ -32,9 +32,9 @@ <h1 date="2023/12/02" hide="true">Zero allocation hello world in Rust</h1>
3232
<code>}</code>
3333
</pre>
3434
<p>Let&rsquo;s look at all <code>malloc</code> allocations in a simple hello world program:</p>
35-
<pre><code><span class="keyword">fn</span> <span class="function">main</span>() { <span class="identifier">println</span>!(<span class="string">"Hello, world!"</span>); }</code>
35+
<pre class="language-rust"><code><span class="keyword">fn</span> <span class="function">main</span>() { <span class="identifier">println</span>!(<span class="string">"Hello, world!"</span>); }</code>
3636
</pre>
37-
<pre><code><span class="command">$> rustc main.rs</span></code>
37+
<pre class="language-shell"><code><span class="command">$> rustc main.rs</span></code>
3838
<code><span class="command">$> LD_PRELOAD=./libm.so ./main</span></code>
3939
<code>malloc: 472</code>
4040
<code>malloc: 120</code>
@@ -48,7 +48,7 @@ <h1 date="2023/12/02" hide="true">Zero allocation hello world in Rust</h1>
4848
<p>A couple of them look pretty suspicious: the 1024-byte allocations are almost surely used for intermediate buffers. We are printing a string to the console, so most likely Rust&rsquo;s implementation of writes to <code>stdout</code> uses buffering for performance.</p>
4949
<p>If we expand all the macros, we should get code roughly equivalent to a <code>write_all</code> call on the <code>stdout()</code> stream: <code>io::stdout().write_all(b"Hello, world!\n")</code>.</p>
5050
<p>We can look at the code of the <code>io</code> module and indeed see that <a href="https://doc.rust-lang.org/src/std/io/stdio.rs.html#614"><code>stdout()</code></a> creates a synchronized instance around a <a href="https://doc.rust-lang.org/src/std/io/buffered/linewriter.rs.html#87"><code>LineWriter</code></a>, which has a default buffer size of 1 KiB.</p>
51-
<pre><code>#[<span class="identifier">must_use</span>]</code>
51+
<pre class="language-rust"><code>#[<span class="identifier">must_use</span>]</code>
5252
<code>#[<span class="function">stable</span>(<span class="identifier">feature</span> = <span class="string">"rust1"</span>, <span class="identifier">since</span> = <span class="string">"1.0.0"</span>)]</code>
5353
<code><span class="keyword">pub</span> <span class="keyword">fn</span> <span class="function">stdout</span>() -> <span class="identifier">Stdout</span> {</code>
5454
<code> <span class="identifier">Stdout</span> {</code>
@@ -61,14 +61,14 @@ <h1 date="2023/12/02" hide="true">Zero allocation hello world in Rust</h1>
6161
<code> #[<span class="function">stable</span>(<span class="identifier">feature</span> = <span class="string">"rust1"</span>, <span class="identifier">since</span> = <span class="string">"1.0.0"</span>)]</code>
6262
<code> <span class="keyword">pub</span> <span class="keyword">fn</span> <span class="function">new</span>(<span class="identifier">inner</span>: <span class="identifier">W</span>) -> <span class="identifier">LineWriter</span><<span class="identifier">W</span>> {</code>
6363
<code> <span class="comment">// Lines typically aren't that long, don't use a giant buffer</span></code>
64-
<code> <span class="identifier">LineWriter</span>::<span class="function">with_capacity</span>(<span class="number">1024,</span> <span class="identifier">inner</span>)</code>
64+
<code> <span class="identifier">LineWriter</span>::<span class="function">with_capacity</span>(<span class="number">1024</span>, <span class="identifier">inner</span>)</code>
6565
<code> }</code>
6666
<code>}</code>
6767
</pre>
6868
<p>Ok, let&rsquo;s get rid of <code>stdout</code> then and use <code>stderr</code>, which also creates a synchronized instance but without any additional buffering. We can write to <code>stderr</code> explicitly or use the <code>eprintln!</code> macro for the same purpose. Let&rsquo;s see how much memory our new program allocates in this case:</p>
69-
<pre><code><span class="keyword">fn</span> <span class="function">main</span>() { <span class="identifier">eprintln</span>!(<span class="string">"Hello, world!"</span>); }</code>
69+
<pre class="language-rust"><code><span class="keyword">fn</span> <span class="function">main</span>() { <span class="identifier">eprintln</span>!(<span class="string">"Hello, world!"</span>); }</code>
7070
</pre>
71-
<pre><code><span class="command">$> rustc main.rs</span></code>
71+
<pre class="language-shell"><code><span class="command">$> rustc main.rs</span></code>
7272
<code><span class="command">$> LD_PRELOAD=./libm.so ./main</span></code>
7373
<code>malloc: 472</code>
7474
<code>malloc: 120</code>
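The article traces allocations from the outside with an `LD_PRELOAD` shim around `malloc`. As a hedged alternative, and not the article's method, a similar count can be taken from inside a Rust program by wrapping the system allocator. The names `CountingAlloc`, `ALLOC_CALLS`, `ALLOC_BYTES` and `count_allocs` are hypothetical. This sketch only sees allocations routed through Rust's global allocator, so its numbers need not match the `malloc` trace above.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper (not from the article): forwards every request to
/// the system allocator and counts calls and requested bytes on the way.
pub struct CountingAlloc;

pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
pub static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Install the wrapper as the allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns how many allocation calls were made meanwhile.
/// Other threads allocating at the same time are counted too.
pub fn count_allocs<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}
```

Wrapping a call such as `eprintln!("Hello, world!")` in `count_allocs` gives a per-call figure rather than a whole-process trace, which may be handier when checking a single AoC solution.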

caching-is-hard.html

+2-2
@@ -16,8 +16,8 @@ <h2><a href="/">naming is hard</a></h2>
1616
</div>
1717
<div class="article">
1818
<section id="Caching-is-hard">
19-
<h1 date="2023/11/01" hide="true">Caching is hard</h1>
20-
<blockquote>
19+
<a class="heading-anchor" href="#Caching-is-hard"><h1 date="2023/11/01" hide="true">Caching is hard</h1>
20+
</a><blockquote>
2121
<p>There are only two hard things in computer science: cache invalidation and naming things</p>
2222
<p><strong>Phil Karlton</strong></p>
2323
</blockquote>

compression-kit.dj

+190
@@ -0,0 +1,190 @@
1+
{date="2024/02/03" hide="true"}
2+
# Compression kit
3+
4+
- Varint encode example
5+
6+
``` ckit
7+
zeros = repeat(bit, 0)
8+
// we will generate two methods: varint_u32.encode and varint_u32.decode
9+
fn varint_u32.encode(x: [32]bit) ~ varint_u32.decode(bytes: [?: 1..=5]u8) {
10+
scan x ~ bytes {
11+
result.push(byte) // ~ byte = result.pop()
12+
// we can reconstruct computation because suffix is constant
13+
if x[7..].prefix_of(zeros) ~ byte & 0x80 == 0 {
14+
byte = x.pop(7) // ~ x.push(byte[0..7])
15+
x.pop(0..) // ~ x.push(zeros[0..]), instead of break, explicitly skip the rest of x
16+
} else {
17+
byte = x.pop(7) | 0x80
18+
}
19+
}
20+
}
21+
```
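For comparison, here is a minimal hand-written Rust sketch of the encode/decode pair that the `ckit` definition above is meant to generate. It assumes LEB128-style ordering (low 7 bits first), which the DSL snippet does not spell out. The function names are hypothetical.

```rust
/// Encodes `x` as 1..=5 bytes with 7 payload bits per byte, low bits first.
/// The high bit (0x80) of a byte means "more bytes follow".
pub fn varint_u32_encode(mut x: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity(5);
    loop {
        let low7 = (x & 0x7f) as u8;
        x >>= 7;
        if x == 0 {
            // The remaining bits are all zero: emit the final byte.
            out.push(low7);
            return out;
        }
        out.push(low7 | 0x80);
    }
}

/// Decodes one varint from the front of `bytes`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or does not fit into a u32.
pub fn varint_u32_decode(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut x: u32 = 0;
    for (i, &byte) in bytes.iter().enumerate().take(5) {
        let payload = u32::from(byte & 0x7f);
        // The fifth byte may only carry the top 4 bits of a u32.
        if i == 4 && payload > 0x0f {
            return None;
        }
        x |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((x, i + 1));
        }
    }
    None
}
```

The decoder mirrors the encoder step by step, which is the property the `~` notation is trying to derive automatically.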
22+
23+
- Shrink encode example
24+
25+
``` ckit
26+
zeros = repeat(bit, 0)
27+
fn shrink_u32.encode(x: [32]bit) ~ shrink_u32.decode(bytes: [n: 1..=4]u8) {
28+
scan x ~ bytes {
29+
result.push(byte) // ~ byte = result.pop()
30+
byte = x.pop(8) // ~ x.push(byte)
31+
if x[0..].prefix_of(zeros) ~ result.empty() {
32+
x.pop(0..) // ~ x.push(zeros[0..])
33+
}
34+
}
35+
}
36+
```
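The shrink scheme above can be sketched by hand in Rust as follows, assuming little-endian byte order and a decoder that is told the encoded length (the `n` in the DSL signature). The function names are hypothetical.

```rust
/// Emits the little-endian bytes of `x` with high-order zero bytes dropped.
/// At least one byte is always written, so the output is 1 to 4 bytes long.
pub fn shrink_u32_encode(x: u32) -> Vec<u8> {
    let bytes = x.to_le_bytes();
    // Index of the last non-zero byte, or keep a single byte for x == 0.
    let len = bytes.iter().rposition(|&b| b != 0).map_or(1, |i| i + 1);
    bytes[..len].to_vec()
}

/// Rebuilds the value by padding the missing high-order bytes with zeros.
/// The slice length tells the decoder how many bytes were kept.
pub fn shrink_u32_decode(bytes: &[u8]) -> Option<u32> {
    if bytes.is_empty() || bytes.len() > 4 {
        return None;
    }
    let mut full = [0u8; 4];
    full[..bytes.len()].copy_from_slice(bytes);
    Some(u32::from_le_bytes(full))
}
```

Unlike the varint, the output is not self-delimiting, so the length has to be stored or implied elsewhere.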
37+
38+
- Length prefixing
39+
40+
``` ckit
41+
fn length_prefix.encode(b: [n: 0..1<<32]u8) ~ length_prefix.decode(bytes: [?: n+1..=n+5]u8) {
42+
bytes.write(varint_length) // ~ 1. varint_length: [?]u8 = result.read()
43+
varint_length = varint_u32.encode(n) // ~ 2. n = varint_u32.decode(varint_length)
44+
bytes.write_fixed(n, b) // ~ 3. b = result.read_fixed(n)
45+
}
46+
```
47+
48+
- Length prefixing with reverse
49+
50+
``` ckit
51+
// involution (self-inverse function): reverse(reverse(b)) == b
52+
extern fn reverse(b: [n]u8) ~ reverse(b: [n]u8);
53+
54+
fn length_prefix.encode(b: [n: 0..1<<32]u8) ~ length_prefix.decode(bytes: [?: n+1..=n+5]u8) {
55+
bytes.write(varint_length) // ~ 1. varint_length: [?]u8 = result.read()
56+
varint_length = varint_u32.encode(n) // ~ 2. n = varint_u32.decode(varint_length)
57+
bytes.write_fixed(n, reversed) // ~ 3. reversed = result.read_fixed(n)
58+
reversed = reverse(b) // ~ 4. b = reverse(reversed)
59+
}
60+
61+
fn length_prefix.encode(b: [n: 0..1<<32]u8) ~ length_prefix.decode(bytes: [?: n+1..=n+5]u8) {
62+
varint_length = varint_u32.encode(n) // ~ 2. n = varint_u32.decode(varint_length)
63+
bytes.write(varint_length) // ~ 1. varint_length: [?]u8 = result.read()
64+
reversed = reverse(b) // ~ 4. b = reverse(reversed)
65+
bytes.write_fixed<n>(reversed) // ~ 3. reversed = result.read_fixed<n>()
66+
}
67+
```
68+
69+
- Syntax for state definition
70+
71+
``` ckit
72+
forward state [?]T {
73+
mut push(x: T) ~ pop() // dual method - we must call pop for every push in forward order
74+
mut write(x: [?]T) ~ read()
75+
mut<n: 0..1<<32> write_fixed(x: [n]T) ~ read_fixed()
76+
}
77+
78+
forward state BlocksLru {
79+
mut encode(offset: u32) ~ decode(encoded: u32)
80+
mut touch(offset: u32): void // no dual method - we must call touch with same argument in dual method
81+
}
82+
83+
forward state FwdBitStream {
84+
mut init(): void
85+
mut flush(): void
86+
mut close(): void
87+
mut<n: 0..=64> push(b: [n]bits) ~ pop()
88+
}
89+
90+
// for backward state we must generate dual operations in reverse order
91+
backward state BwdBitStream {
92+
mut init_write() ~ close_read()
93+
mut flush() ~ reload()
94+
mut close_write() ~ init_read()
95+
mut<n: 0..=64> push(b: [n]bits) ~ pop()
96+
}
97+
98+
// must not compile because .decode can't be implemented
99+
fn invalid.encode(x: u32) ~ invalid.decode(bytes: [?]u32) {
100+
var ( blocks: BlocksLru )
101+
y = blocks.encode(x) // 2. ~ x = blocks.decode(y) !!! y is unknown here but we can't violate state operations order
102+
z = blocks.encode(y) // 3. ~ y = blocks.decode(z)
103+
bytes.push(z) // 1. ~ z = result.pop()
104+
}
105+
```
106+
107+
- [QOI](https://qoiformat.org/)
108+
109+
``` ckit
110+
111+
const RGBA = struct {
112+
r: 0..256,
113+
g: 0..256,
114+
b: 0..256,
115+
a: 0..256,
116+
}
117+
118+
const Header = struct {
119+
magic: "qoif",
120+
width: 0..1<<32,
121+
height: 0..1<<32,
122+
channels: {3, 4},
123+
colorspace: {0, 1},
124+
}
125+
126+
const Operation = union {
127+
QOI_OP_RGB = struct { r: 0..256, g: 0..256, b: 0..256 },
128+
QOI_OP_RGBA = struct { r: 0..256, g: 0..256, b: 0..256, a: 0..256 },
129+
QOI_OP_INDEX = struct { index: 0..64 },
130+
QOI_OP_DIFF = struct { dr: -2..2, dg: -2..2, db: -2..2 },
131+
QOI_OP_LUMA = struct { dg: -32..32, dr_dg: -8..8, db_dg: -8..8 },
132+
QOI_OP_RUN = struct { run: 0..64 },
133+
}
134+
135+
fn qoi.encode({ header: Header, operations: [n]Operation }) ~ qoi.decode(bytes: [?]u8) {
136+
bytes.write_fixed<4>("qoif") // 1. ~ assert!(bytes.read_fixed<4>() == "qoif")
137+
bytes.write_fixed<4>(header.width) // 2. ~ header.width = bytes.read_fixed<4>()
138+
bytes.write_fixed<4>(header.height) // 3. ~ header.height = bytes.read_fixed<4>()
139+
bytes.write_fixed<1>(header.channels) // 4. ~ header.channels = bytes.read_fixed<1>()
140+
bytes.write_fixed<1>(header.colorspace) // 5. ~ header.colorspace = bytes.read_fixed<1>()
141+
142+
~ size = header.width * header.height // initialize counter only for .decode method
143+
scan operations ~ bytes {
144+
current = operations.pop()
145+
match current {
146+
.QOI_OP_RGB ~ bytes[0] == 0b11111110 {
147+
bytes.write_fixed<4>([0b11111110, current.r, current.g, current.b])
148+
~ size -= 1
149+
}
150+
.QOI_OP_RGBA ~ bytes[0] == 0b11111111 {
151+
bytes.write_fixed<5>([0b11111111, current.r, current.g, current.b, current.a])
152+
~ size -= 1
153+
}
154+
.QOI_OP_INDEX ~ bytes[0][0..2] == 0b00 {
155+
bytes.write_fixed<1>([0b00 || current.index])
156+
~ size -= 1
157+
}
158+
.QOI_OP_DIFF ~ bytes[0][0..2] == 0b01 {
159+
bytes.write_fixed<1>([0b01 || current.dr + 2 || current.dg + 2 || current.db + 2])
160+
~ size -= 1
161+
}
162+
.QOI_OP_LUMA ~ bytes[0][0..2] == 0b10 {
163+
bytes.write_fixed<2>([0b10 || current.dg + 32, current.dr_dg + 8 || current.db_dg + 8])
164+
~ size -= 1
165+
}
166+
.QOI_OP_RUN ~ bytes[0][0..2] == 0b11 {
167+
bytes.write_fixed<1>([0b11 || current.run])
168+
~ size -= current.run
169+
}
170+
}
171+
if ~ size == 0 {
172+
break
173+
}
174+
}
175+
bytes.write_fixed<8>([0, 0, 0, 0, 0, 0, 0, 1])
176+
}
177+
178+
forward state qoi.Recent {
179+
180+
}
181+
182+
fn qoi.compress(image: [height: 0..1<<32, width: 0..1<<32]RGBA) ~ qoi.decompress({ header: Header, operations: [n]Operation }) {
183+
header = Header { width: width, height: height, channels: 4, colorspace: 0 }
184+
// how to iterate over 2d array?
185+
scan image ~ operations {
186+
187+
}
188+
}
189+
190+
```
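The `match` arms in `qoi.encode` above correspond to the chunk layouts in the QOI specification. A hand-written Rust sketch of just that step might look like this. The `Op` enum, `encode_op` and `bias` are hypothetical names. One deliberate difference, following the QOI specification rather than the draft above: the run length is stored with a bias of -1 and limited to 1..=62, so that a run chunk cannot collide with the `QOI_OP_RGB` and `QOI_OP_RGBA` tags.

```rust
/// One QOI chunk; field ranges follow the QOI specification.
pub enum Op {
    Rgb { r: u8, g: u8, b: u8 },
    Rgba { r: u8, g: u8, b: u8, a: u8 },
    Index { index: u8 },                   // 0..=63
    Diff { dr: i8, dg: i8, db: i8 },       // each in -2..=1
    Luma { dg: i8, dr_dg: i8, db_dg: i8 }, // dg in -32..=31, others in -8..=7
    Run { run: u8 },                       // 1..=62
}

/// Shifts a signed delta from -b..b into an unsigned field (callers must
/// pass in-range values; this is only checked in debug builds).
fn bias(v: i8, b: i8) -> u8 {
    debug_assert!((-b..b).contains(&v));
    (v + b) as u8
}

/// Appends the byte encoding of a single chunk to `out`.
pub fn encode_op(op: &Op, out: &mut Vec<u8>) {
    match *op {
        Op::Rgb { r, g, b } => out.extend_from_slice(&[0b1111_1110, r, g, b]),
        Op::Rgba { r, g, b, a } => out.extend_from_slice(&[0b1111_1111, r, g, b, a]),
        // 2-bit tag 0b00 followed by the 6-bit index.
        Op::Index { index } => out.push(index & 0x3f),
        // 2-bit tag 0b01, then three 2-bit deltas biased by 2.
        Op::Diff { dr, dg, db } => {
            out.push(0b0100_0000 | (bias(dr, 2) << 4) | (bias(dg, 2) << 2) | bias(db, 2))
        }
        // 2-bit tag 0b10 plus 6-bit dg, then two 4-bit deltas relative to dg.
        Op::Luma { dg, dr_dg, db_dg } => {
            out.push(0b1000_0000 | bias(dg, 32));
            out.push((bias(dr_dg, 8) << 4) | bias(db_dg, 8));
        }
        // 2-bit tag 0b11 plus the run length stored as run - 1.
        Op::Run { run } => {
            debug_assert!((1..=62).contains(&run));
            out.push(0b1100_0000 | (run - 1));
        }
    }
}
```

The 2-bit tags are only unambiguous because the two 8-bit tags are checked first when decoding, which is the same ordering concern the `match` in the DSL has to respect.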
