$ ./huffman
Huffman coding algorithm
by Sergey Tikhonov ([email protected])
Usage: huffman [OPTIONS] input [output]
The default action is to encode input file.
-d Decode file.
$ cat input.txt
In computer science and information theory, Huffman coding is an entropy
encoding algorithm used for lossless data compression. The term refers to the
use of a variable-length code table for encoding a source symbol (such as a
character in a file) where the variable-length code table has been derived in a
particular way based on the estimated probability of occurrence for each
possible value of the source symbol. It was developed by David A. Huffman while
he was a Ph.D. student at MIT, and published in the 1952 paper "A Method for the
Construction of Minimum-Redundancy Codes".
$ ./huffman input.txt
41
0.164666 110
e 0.099485 001
a 0.066895 1010
o 0.061750 1000
n 0.051458 0110
t 0.051458 0101
r 0.048027 0001
s 0.048027 0000
i 0.046312 11111
d 0.037736 11100
c 0.036021 10111
h 0.036021 10110
l 0.032590 10011
u 0.027444 01111
f 0.025729 01001
b 0.022298 111101
m 0.020583 111100
p 0.017153 100101
y 0.013722 011101
. 0.010292 010000
g 0.010292 1110111
v 0.010292 1110110
w 0.008576 1001001
- 0.005146 11101011
M 0.005146 11101010
I 0.005146 11101001
, 0.003431 11101000
T 0.003431 01110001
D 0.003431 01110000
C 0.003431 01110011
H 0.003431 01110010
" 0.003431 01000101
A 0.003431 01000100
2 0.001715 100100001
R 0.001715 100100000
9 0.001715 100100011
5 0.001715 100100010
P 0.001715 010001101
1 0.001715 010001100
) 0.001715 010001111
( 0.001715 010001110
11101001011011010111100011110010010101111010100100011100000101111111100101101011
10011101010011011100110111110110010011000000111110010100101111111000011011001011
01100011000000101110111101000110011100100111101001010011111001010011011010111100
01110011111011011101111101111100001101010011011000101100101000110001001010111011
10001011010111100011100111110110111011111010101001111101111000000111111010110110
11110011001111000000111100110010011000000111010011100000000000100110010000000011
01110010100101101011010111100011110010010100010010000000011111100001100100001100
11100011011000111001010010001111100110000100101001001000100001100101100011001011
01100011100111100000011101000010011101010110111011010100001111111010111101100110
01111010111001100101101110111010110110110101111000111000011100101101011110110011
00111001001100000011100010110101111000111001111101101110111110101011000001000011
11000110111001110000001110111110011110110001001111001000111000000111110111101101
10101000001101010110101111011010100001101010111010100100011101111101101101010110
01001111111001100101000111111010010011011000100010011100101101100011101110110101
00001111111010111101100110011110101110011001011011101110101101101101011110001110
00011100101101011110110011001110101101010000011011110100100101101101110000100011
11111110110001111001101111101101101010110100101101000010101111111011101111100111
01000011101001001101001110111011110110100000001111001101000011011001011011000111
00010000010111111111100101001010011110011010010100011000111101101011110111111100
11111110101011101110100001001110100010111101110111100010001001011010111001110010
01100000011100011010101111011011010010110000000000011111111101100110011101110110
10101001101111001110100001001110010110110001110000010000111100011011100111000000
11101111100111101100010011010000110111010010101110100100110100000110111000011110
11000110011100010010100111100110111101011101110011100001010111011011111111001100
10001000100001100111001001111010010100111110010100110110100100110110111111001100
11101011000111010010011010000011010101100100011011011001000001110000010000110000
00101011111110000101100101110101001011101110101011101001011100011110100011010100
11011100110100101011111111011001111111000010110001111001101111101101100101101100
01110010001100100100011100100010100100001110100101101010010100100011100100010101
00010011011101010001010110110100011100110010011000000111001011011000111001110011
10000110000001010001011111011101011111110000110110100001001110111010101111101101
11111111000111111110011101011100100000001111000111101101110010100110101110111011
1001110011100011100001000001000101010000