doc/compressor_with_sparse_array.txt

It is possible to speedup dictionary lookup.
We can store codes trie in single sparse array.
We can name it "next codes".

For example:
"a" (97) -> "c" (262)
         -> "b" (257)
         -> "a" (259) -> "c" (263)
                      -> "b" (260)
"b" (98) -> "a" (258)
"c" (99) -> "a" (261)

Next codes:
...
97 ("a") * 256 + 97 ("a") -> 259
97 ("a") * 256 + 98 ("b") -> 257
97 ("a") * 256 + 99 ("c") -> 262
...
98 ("b") * 256 + 97 ("a") -> 258
...
99 ("c") * 256 + 97 ("a") -> 261
...
258 (259) ("aa") * 256 + 98 ("b") -> 260
258 (259) ("aa") * 256 + 99 ("c") -> 263
...

We need to subtract 1 because we don't need to store clear code.

-----

The main idea is very simple.

We have a lookup function with params: current_code and next_symbol.
We can use: return sparse_array[current_code * 256 + next_symbol].

We have a save function with params: current_code, next_symbol, next_code.
We can use: sparse_array[current_code * 256 + next_symbol] = next_code;

Both operations has O(1) compexity.

Indexes of this array are between 0 and ((2 ** 16) - 1) * 256 - 1.
Values >= 257.
We can still use zero constant as undefined next code.

-----

How much memory will it consume?
(((2 ** 16) - 1) * 256) * sizeof(code_t) ~ 33.5 MB.

For example we have "linux.tar" with size about 850 MB.
Practice shows that we need about 2040 dictionary clears.

We need to clear 33.5 MB * 2040 times ~ 68 GB.
Speed of memset is about 10 GB per second for modern pc with DDR-4.
So we are loosing about 7 seconds just to clear sparse array.
We need to find a smarter way to clear it.

We can store used sparse array indexes in separate array.
Length of this array equals to codes length: ((2 ** 16) - 257)
Size of used index is 4 bytes.
So it will consume ((2 ** 16) - 257) * sizeof(used_index_t) ~ 261 KB.

There is another solution.
We can use decompressor dictionary: previous codes and last symbol by codes.
It requires ((2 ** 16) - 257) * sizeof(code_t) + ((2 ** 16) - 257) ~ 196 KB.

But this solution is slower: we need to maintain more complex structure.
We are using 33.5 MB for sparse array.
It would be weird try to save 65 KB.

-----

PS If block mode is disabled we don't need clear code (256).
So we don't need to subtract 1 from current code.