|
| 1 | +module Bytes exposing |
| 2 | + ( Bytes |
| 3 | + , width |
| 4 | + , Endianness(..) |
| 5 | + , getHostEndianness |
| 6 | + ) |
| 7 | + |
| 8 | + |
| 9 | +{-| |
| 10 | +
|
| 11 | +# Bytes |
| 12 | +@docs Bytes, width |
| 13 | +
|
| 14 | +# Endianness |
| 15 | +@docs Endianness, getHostEndianness |
| 16 | +
|
| 17 | +-} |
| 18 | + |
| 19 | + |
| 20 | +import Elm.Kernel.Bytes |
| 21 | +import Task exposing (Task) |
| 22 | + |
| 23 | + |
| 24 | +-- BYTES |
| 25 | + |
| 26 | + |
| 27 | +{-| A sequence of bytes. |
| 28 | +
|
| 29 | +A byte is a chunk of eight bits. For example, the letter `j` is usually |
| 30 | +represented as the byte `01101010`, and the letter `k` is `01101011`. |
| 31 | +
|
| 32 | +Seeing each byte as a stream of zeros and ones can be quite confusing though, |
| 33 | +so it is common to use hexadecimal numbers instead: |
| 34 | +
|
| 35 | +``` |
| 36 | +| Binary | Hex | |
| 37 | ++--------+-----+ |
| 38 | +|  0000  |  0  | |
| 39 | +|  0001  |  1  | |
| 40 | +|  0010  |  2  | |
| 41 | +|  0011  |  3  |     j = 01101010 |
| 42 | +|  0100  |  4  |         \__/\__/ |
| 43 | +|  0101  |  5  |           |   | |
| 44 | +|  0110  |  6  |           6   A |
| 45 | +|  0111  |  7  | |
| 46 | +|  1000  |  8  |     k = 01101011 |
| 47 | +|  1001  |  9  |         \__/\__/ |
| 48 | +|  1010  |  A  |           |   | |
| 49 | +|  1011  |  B  |           6   B |
| 50 | +|  1100  |  C  | |
| 51 | +|  1101  |  D  | |
| 52 | +|  1110  |  E  | |
| 53 | +|  1111  |  F  | |
| 54 | +``` |
| 55 | +
|
| 56 | +So `j` is `6A` and `k` is `6B` in hexadecimal. This more compact representation |
| 57 | +is great when you have a sequence of bytes. You can see this even in a short |
| 58 | +string like `"jazz"`: |
| 59 | +
|
| 60 | +``` |
| 61 | +binary                                 hexadecimal |
| 62 | +01101010 01100001 01111010 01111010 => 6A 61 7A 7A |
| 63 | +``` |
| 64 | +
|
| 65 | +Anyway, the point is that `Bytes` is a sequence of bytes! |
| 66 | +-} |
| 67 | +type Bytes = Bytes |
| 68 | + |
| 69 | + |
| 70 | +{-| Get the width of a sequence of bytes. |
| 71 | +
|
| 72 | +So if a sequence has four hundred bytes, then `width bytes` would give back |
| 73 | +`400`. That may be 400 unsigned 8-bit integers, 100 signed 32-bit integers, or |
| 74 | +even a UTF-8 string. The content does not matter. This is just figuring out |
| 75 | +how many bytes there are! |
| 76 | +-} |
| 77 | +width : Bytes -> Int |
| 78 | +width = |
| 79 | + Elm.Kernel.Bytes.width |
| 80 | + |
| 81 | + |
| 82 | + |
| 83 | +-- ENDIANNESS |
| 84 | + |
| 85 | + |
| 86 | +{-| Different computers store integers and floats slightly differently in |
| 87 | +memory. Say we have the integer `0x1A2B3C4D` in our program. It needs four |
| 88 | +bytes (32 bits) in memory. It may seem reasonable to lay them out in order: |
| 89 | +
|
| 90 | +``` |
| 91 | + Big-Endian (BE) (Obvious Order) |
| 92 | ++----+----+----+----+ |
| 93 | +| 1A | 2B | 3C | 4D | |
| 94 | ++----+----+----+----+ |
| 95 | +``` |
| 96 | +
|
| 97 | +But some people thought it would be better to store the bytes in the opposite |
| 98 | +order: |
| 99 | +
|
| 100 | +``` |
| 101 | + Little-Endian (LE) (Shuffled Order) |
| 102 | ++----+----+----+----+ |
| 103 | +| 4D | 3C | 2B | 1A | |
| 104 | ++----+----+----+----+ |
| 105 | +``` |
| 106 | +
|
| 107 | +Notice that **the _bytes_ are shuffled, not the bits.** It is as if you cut a |
| 108 | +photo into four strips and shuffled the strips. It is not a mirror image. |
| 109 | +The theory seems to be that an 8-bit `0x1A` and a 32-bit `0x0000001A` both have |
| 110 | +`1A` as the first byte in this scheme. Maybe this was helpful when processors |
| 111 | +handled one byte at a time. |
| 112 | +
|
| 113 | +**Most processors use little-endian (LE) layout.** This seems to be because |
| 114 | +Intel did it this way, and other chip manufacturers followed their convention. |
| 115 | +**Most network protocols use big-endian (BE) layout.** I suspect this is |
| 116 | +because if you are trying to debug a network protocol, it is nice if your |
| 117 | +integers are not all shuffled. |
| 118 | +
|
| 119 | +**Note:** Endianness is relevant for integers and floats, but not strings. |
| 120 | +UTF-8 specifies the order of bytes explicitly. |
| 121 | +
|
| 122 | +**Note:** The terms little-endian and big-endian are a reference to an egg joke |
| 123 | +in Gulliver's Travels. They first appeared in 1980 in [this essay][essay], and |
| 124 | +you can decide for yourself if they stood the test of time. I personally find |
| 125 | +these terms quite unhelpful, so I say “Obvious Order” and “Shuffled Order” in |
| 126 | +my head. I remember which is more common by asking myself, “if things were |
| 127 | +obvious, would I have to ask this question?” |
| 128 | +
|
| 129 | +[essay]: http://www.ietf.org/rfc/ien/ien137.txt |
| 130 | +-} |
| 131 | +type Endianness = LE | BE |
| 132 | + |
| 133 | + |
| 134 | +{-| Is this program running on a big-endian or little-endian machine? |
| 135 | +-} |
| 136 | +getHostEndianness : Task x Endianness |
| 137 | +getHostEndianness = |
| 138 | + Elm.Kernel.Bytes.getHostEndianness LE BE |
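The two layouts described in the `Endianness` doc comment can be checked with a small sketch. It assumes the `Bytes.Encode` and `Bytes.Decode` modules from the published `elm/bytes` package, which are not part of this diff; the module name `EndiannessSketch` and the helper `firstByte` are hypothetical names chosen for illustration.

```elm
module EndiannessSketch exposing (bigEndian, firstByte, littleEndian)

-- Assumption: Bytes.Encode and Bytes.Decode come from the published
-- elm/bytes package. This diff only adds the Bytes module itself.

import Bytes exposing (Bytes, Endianness(..))
import Bytes.Decode as Decode
import Bytes.Encode as Encode


{-| The integer from the doc comment in "Obvious Order": 1A 2B 3C 4D.
-}
bigEndian : Bytes
bigEndian =
    Encode.encode (Encode.unsignedInt32 BE 0x1A2B3C4D)


{-| The same integer in "Shuffled Order": 4D 3C 2B 1A.
-}
littleEndian : Bytes
littleEndian =
    Encode.encode (Encode.unsignedInt32 LE 0x1A2B3C4D)


{-| Read the first byte of a sequence as an unsigned 8-bit integer.
Gives `Nothing` if the sequence is empty.
-}
firstByte : Bytes -> Maybe Int
firstByte bytes =
    Decode.decode Decode.unsignedInt8 bytes
```

Under these assumptions, `Bytes.width bigEndian` and `Bytes.width littleEndian` are both `4`, because a 32-bit integer always takes four bytes. Only the order differs: `firstByte bigEndian` is `Just 26` (`0x1A`), while `firstByte littleEndian` is `Just 77` (`0x4D`).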