diff --git a/Chapters/01-memory.qmd b/Chapters/01-memory.qmd index 4ace84e..c9838d1 100644 --- a/Chapters/01-memory.qmd +++ b/Chapters/01-memory.qmd @@ -387,6 +387,47 @@ object you declare is stored: 1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then, he is most certainly not an object stored in the heap. +## Stack overflows {#sec-stack-overflow} + +Allocating memory on the stack is generally faster than allocating it on the heap. +But this better performance comes with many restrictions. We have already discussed +many of these restrictions of the stack at @sec-stack. But there is one more important +limitation that I want to talk about, which is the size of the stack itself. + +The stack is limited in size. This size varies from computer to computer, and it depends on +a lot of things (the computer architecture, the operating system, etc.). Nevertheless, this size is usually +not that big. This is why we normally use the stack to store only temporary and small objects in memory. + +In essence, if you try to make an allocation on the stack that is so big that it exceeds the stack size limit, +a *stack overflow* happens, and your program just crashes as a result of that. In other words, a stack overflow happens when +you attempt to use more space than is available on the stack. + +This type of problem is very similar to a *buffer overflow*, i.e. you are trying to use more space +than is available in the "buffer object". However, a stack overflow always causes your program to crash, +while a buffer overflow does not always cause your program to crash (although it often does). + +You can see an example of a stack overflow in the example below. We are trying to allocate a very big array of `u64` values +on the stack. You can see below that this program does not run successfully, because it crashes +with a "segmentation fault" error message.
+ +```{zig} +#| eval: false +#| build_type: "run" +#| auto_main: true +var very_big_alloc: [1000 * 1000 * 24]u64 = undefined; +@memset(very_big_alloc[0..], 0); +``` + +``` +Segmentation fault (core dumped) +``` + +This segmentation fault error is a result of the stack overflow that was caused by the big +memory allocation made on the stack, to store the `very_big_alloc` object. +This is why very big objects are usually stored on the heap, instead of the stack. + + + ## Allocators {#sec-allocators} One key aspect about Zig, is that there are "no hidden-memory allocations" in Zig. @@ -406,14 +447,14 @@ allocate memory. Just look at the arguments of this function. If a function, or operator, have an allocator object as one of its inputs/arguments, then, you know for sure that this function/operator will allocate some memory during its execution. -An example is the `allocPrint()` function from the Zig standard library. With this function, you can +An example is the `allocPrint()` function from the Zig Standard Library. With this function, you can write a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C. In order to write such new string, the `allocPrint()` function needs to allocate some memory to store the output string. That is why, the first argument of this function is an allocator object that you, the user/programmer, gives as input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator -object. But I could easily use any other type of allocator object from the Zig standard library. +object. But I could easily use any other type of allocator object from the Zig Standard Library. ```{zig} #| auto_main: true @@ -430,9 +471,8 @@ try stdout.print("{s}\n", .{output}); ``` -You get a lot of control -over where and how much memory this function can allocate. Because it is you, -the user/programmer, that provides the allocator for the function to use. 
+You get a lot of control over where and how much memory this function can allocate. +Because it is you, the user/programmer, that provides the allocator for the function to use. This makes "total control" over memory management easier to achieve in Zig. ### What are allocators? @@ -440,14 +480,14 @@ This makes "total control" over memory management easier to achieve in Zig. Allocators in Zig are objects that you can use to allocate memory for your program. They are similar to the memory allocating functions in C, like `malloc()` and `calloc()`. So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask -for more memory using an allocator. +for more memory by using an allocator object. Zig offers different types of allocators, and they are usually available through the `std.heap` module of -the standard library. So, just import the Zig standard library into your Zig module (with `@import("std")`), and you can start +the standard library. Thus, just import the Zig Standard Library into your Zig module (with `@import("std")`), and you can start using these allocators in your code. -Furthermore, every allocator object is built on top of the `Allocator` interface in Zig. This -means that, every allocator object you find in Zig must have the methods `alloc()`, +Furthermore, every allocator object is built on top of the `Allocator` interface in Zig. +This means that, every allocator object you find in Zig must have the methods `alloc()`, `create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using, but you don't need to change the function calls to the methods that do the memory allocation (and the free memory operations) for your program. @@ -480,7 +520,7 @@ The heap fit this description. Allocating memory on the heap is commonly known as dynamic memory management. 
As the objects you create grow in size during the execution of your program, you grow the amount of memory you have by allocating more memory in the heap to store these objects. -And you that in Zig, by using an allocator object. +And you do that in Zig, by using an allocator object. ### The different types of allocators @@ -543,11 +583,20 @@ in each call, and you most likely will not need that much memory in your program ### Buffer allocators The `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that -work with a fixed sized buffer that is stored in the stack. So these two allocators only allocates -memory in the stack. This also means that, in order to use these allocators, you must first -create a buffer object, and then, give this buffer as an input to these allocators. +work with a fixed sized buffer object behind the scenes. In other words, they use a fixed sized buffer +object as the basis for the memory. When you ask these allocator objects to allocate some memory for you, +they are essentially reserving some amount of space inside this fixed sized buffer object for you to use. -In the example below, I am creating a `buffer` object that is 10 elements long. +This means that, in order to use these allocators, you must first create a buffer object in your code, +and then, give this buffer object as an input to these allocators. + +This also means that these allocator objects can allocate memory either in the stack or in the heap. +Everything depends on where the buffer object that you provide lives. If this buffer object lives +in the stack, then, the memory allocated is "stack-based". But if it lives on the heap, then, +the memory allocated is "heap-based". + + +In the example below, I'm creating a `buffer` object on the stack that is 10 elements long. Notice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.
Now, because this `buffer` object is 10 elements long, this means that I am limited to this space. I cannot allocate more than 10 elements with this allocator object. If I try to @@ -567,6 +616,36 @@ const input = try allocator.alloc(u8, 5); defer allocator.free(input); ``` +Remember, the memory allocated by these allocator objects can be either from +the stack, or, from the heap. It all depends on where the buffer object that you provide lives. +In the above example, the `buffer` object lives in the stack, and, therefore, the memory allocated +is based in the stack. But what if it was based on the heap? + +As we described at @sec-stack-overflow, one of the main reasons why you would use the heap, +instead of the stack, is to allocate huge amounts of space to store very big objects. +Thus, let's suppose you wanted to use a very big buffer object as the basis for your +allocator objects. You would have to allocate this very big buffer object on the heap. +The example below demonstrates this case. + +```{zig} +#| eval: false +#| build_type: "run" +#| auto_main: true +const heap = std.heap.page_allocator; +const memory_buffer = try heap.alloc( + u8, 100 * 1024 * 1024 // 100 MB memory +); +defer heap.free(memory_buffer); +var fba = std.heap.FixedBufferAllocator.init( + memory_buffer +); +const allocator = fba.allocator(); + +const input = try allocator.alloc(u8, 1000); +defer allocator.free(input); +``` + + ### Arena allocator {#sec-arena-allocator} diff --git a/Chapters/01-zig-weird.qmd b/Chapters/01-zig-weird.qmd index 3868e9b..c9968ef 100644 --- a/Chapters/01-zig-weird.qmd +++ b/Chapters/01-zig-weird.qmd @@ -1316,7 +1316,7 @@ that are very useful to use when working with strings. Most notably: - `std.mem.splitScalar()`: to split a string into an array of substrings given a delimiter value. - `std.mem.splitSequence()`: to split a string into an array of substrings given a substring delimiter. - `std.mem.startsWith()`: to check if string starts with substring. 
-- `std.mem.endsWith()`: to check if string starts with substring. +- `std.mem.endsWith()`: to check if string ends with substring. - `std.mem.trim()`: to remove specific values from both start and end of the string. - `std.mem.concat()`: to concatenate strings together. - `std.mem.count()`: to count the occurrences of substring in the string. @@ -1465,7 +1465,7 @@ In other words, the `zig` compiler does not obligates you to use such tools. The tools listed below are related to memory safety. That is, they help you to achieve memory safety in your Zig code: -- `defer` allows you to keep free operations phisically close to allocations. This helps you to avoid memory leaks, "use after free", and also "double-free" problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime. +- `defer` allows you to keep free operations physically close to allocations. This helps you to avoid memory leaks, "use after free", and also "double-free" problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime. - `errdefer` helps you to guarantee that your program frees the allocated memory, even if a runtime error occurs. - pointers and objects are non-nullable by default. This helps you to avoid memory problems that might arise from de-referencing null pointers. - Zig offers some native types of allocators (called "testing allocators") that can detect memory leaks and double-frees. These types of allocators are widely used on unit tests, so they transform your unit tests into a weapon that you can use to detect memory problems in your code. diff --git a/Chapters/03-structs.qmd b/Chapters/03-structs.qmd index e02bc32..09cfdc7 100644 --- a/Chapters/03-structs.qmd +++ b/Chapters/03-structs.qmd @@ -606,7 +606,7 @@ But I want to emphasize a curious fact about function parameters (a.k.a. 
functio In summary, function parameters are immutable in Zig. Take the code example below, where we declare a simple function that just tries to add -some amount to the input integer, and returns the result back. But if you look closely +some amount to the input integer, and returns the result back. If you look closely at the body of this `add2()` function, you will notice that we try to save the result back into the `x` function argument. @@ -616,8 +616,8 @@ into `x`. However, function arguments in Zig are immutable. You cannot change th cannot assign values to them inside the body's function. This is the reason why, the code example below do not compile successfully. If you try to compile -this code example, you get a compile error warning you that you are trying to change the value of a -immutable (i.e. constant) object. +this code example, you will get a compile error message about "trying to change the value of an +immutable (i.e. constant) object". ```{zig} #| eval: false @@ -640,6 +640,8 @@ t.zig:3:5: error: cannot assign to constant ``` +### A free optimization + If a function argument receives as input an object whose data type is any of the primitive types that we have listed in @sec-primitive-data-types, this object is always passed by value to the function. In other words, this object @@ -653,16 +655,20 @@ choose the strategy that is faster for you. This optimization that you get for free is possible only because function arguments are immutable in Zig. + +### How to overcome this barrier + There are some situations where you might need to change the value of your function argument directly inside the function's body. This happens more often when we are passing C structs as inputs to Zig functions. -In a situation like this, you can overcome this barrier of immutable function arguments, by simply taking the lead, -and explicitly choosing to pass the object by reference to the function.
-That is, instead of depending on the `zig` compiler to decide which strategy is best, you have -to explicitly mark the function argument as a pointer. This way, we are telling the compiler -that this function argument will be passed by reference to the function. +In a situation like this, you can overcome this barrier by using a pointer. In other words, +instead of passing a value as input to the argument, you can pass a "pointer to the value". +You can change the value that the pointer points to, by dereferencing it. +Therefore, if we take our previous `add2()` example, we can change the value of the +function argument `x` inside the function's body by marking the `x` argument as a +"pointer to a `u32` value" (i.e. `*u32` data type), instead of a `u32` value. By making it a pointer, we can finally alter the value of this function argument directly inside the body of the `add2()` function. You can see that the code example below compiles successfully. @@ -687,6 +693,13 @@ Result: 6 ``` +Even in this code example above, the `x` argument is still immutable, which means that the pointer itself is immutable. +Therefore, you cannot change the memory address that it points to. However, you can dereference the pointer +to access the value that it points to, and also, to change this value, if you need to. + + + + ## Structs and OOP {#sec-structs-and-oop} @@ -988,7 +1001,7 @@ const Vec3 = struct { return m.sqrt(xd + yd + zd); } - pub fn double(self: *Vec3) void { + pub fn twice(self: *Vec3) void { self.x = self.x * 2.0; self.y = self.y * 2.0; self.z = self.z * 2.0; @@ -997,9 +1010,9 @@ const Vec3 = struct { ``` Notice in the code example above that we have added a new method -to our `Vec3` struct named `double()`. This method doubles the +to our `Vec3` struct named `twice()`. This method doubles the coordinate values of our vector object.
In the -case of the `double()` method, we annotated the `self` argument as `*Vec3`, +case of the `twice()` method, we annotated the `self` argument as `*Vec3`, indicating that this argument receives a pointer (or a reference, if you prefer to call it this way) to a `Vec3` object as input. @@ -1008,7 +1021,7 @@ to a `Vec3` object as input. var v3 = Vec3 { .x = 4.2, .y = 2.4, .z = 0.9 }; -v3.double(); +v3.twice(); std.debug.print("Doubled: {d}\n", .{v3.x}); ``` @@ -1018,15 +1031,15 @@ Doubled: 8.4 -Now, if you change the `self` argument in this `double()` method to `self: Vec3`, like in the +Now, if you change the `self` argument in this `twice()` method to `self: Vec3`, like in the `distance()` method, you will get the compiler error exposed below as result. Notice that this -error message is showing a line from the `double()` method body, +error message is showing a line from the `twice()` method body, indicating that you cannot alter the value of the `x` data member. ```{zig} #| eval: false // If we change the function signature of double to: - pub fn double(self: Vec3) void { + pub fn twice(self: Vec3) void { ``` ``` diff --git a/Chapters/10-stack-project.qmd b/Chapters/10-stack-project.qmd index e78bdde..36fd0fc 100644 --- a/Chapters/10-stack-project.qmd +++ b/Chapters/10-stack-project.qmd @@ -68,22 +68,22 @@ function argument that is not known at compile time, the `zig` compiler will not as a consequence, it will raise a compilation error saying that it cannot compile your program. Because you are providing a value that is "runtime known" to a function argument that must be "compile-time known". -Take a look at this very simple example below, where we define a `double()` function, that simply +Take a look at this very simple example below, where we define a `twice()` function, that simply doubles the input value named `num`. Notice that we use the `comptime` keyword before the name of the function argument. 
This keyword is marking the function argument `num` as a "comptime argument". That is a function argument whose value must be compile-time known. This is why the expression -`double(5678)` is valid, and no compilation errors are raised. Because the value `5678` +`twice(5678)` is valid, and no compilation errors are raised. Because the value `5678` is compile-time known, so this is the expected behaviour for this function. ```{zig} #| auto_main: false #| build_type: "test" -fn double(comptime num: u32) u32 { +fn twice(comptime num: u32) u32 { return num * 2; } test "test comptime" { - _ = double(5678); + _ = twice(5678); } ``` @@ -92,14 +92,14 @@ For example, we might provide a different input value to this function depending on the target OS of our compilation process. The code example below demonstrates such case. Because the value of the object `n` is determined at runtime, we cannot provide this object -as input to the `double()` function. The `zig` compiler will not allow it, because we marked +as input to the `twice()` function. The `zig` compiler will not allow it, because we marked the `num` argument as a "comptime argument". 
That is why the `zig` compiler raises the compile-time error exposed below: ```{zig} #| eval: false const builtin = @import("builtin"); -fn double(comptime num: u32) u32 { +fn twice(comptime num: u32) u32 { return num * 2; } test "test comptime" { @@ -109,7 +109,7 @@ test "test comptime" { } else { n = 5678; } - _ = double(n); + _ = twice(n); } ``` diff --git a/_freeze/Chapters/01-memory/execute-results/html.json b/_freeze/Chapters/01-memory/execute-results/html.json index e56d7f8..45d5cb2 100644 --- a/_freeze/Chapters/01-memory/execute-results/html.json +++ b/_freeze/Chapters/01-memory/execute-results/html.json @@ -1,9 +1,11 @@ { - "hash": "ecaae861d3a311b99ed3a8fb0913c809", + "hash": "377038093bffc7a98a1d853d42a515cb", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Memory and Allocators\n\n\nIn this chapter, we will talk about memory. How does Zig controls memory? What\ncommon tools are used? Are there any important aspect that makes memory\ndifferent/special in Zig? You will find the answers here.\n\nComputers fundamentally rely on memory to function. This memory acts as a temporary storage\nspace for the data and values generated during computations. Without memory, the core\nconcepts of \"variables\" and \"objects\" in programming languages would be impossible.\n\n\n\n\n## Memory spaces\n\nEvery object that you create in your Zig source code needs to be stored somewhere,\nin your computer's memory. Depending on where and how you define your object, Zig\nwill use a different \"memory space\", or a different\ntype of memory to store this object.\n\nEach type of memory normally serves for different purposes.\nIn Zig, there are 3 types of memory (or 3 different memory spaces) that we care about. 
They are:\n\n- Global data register (or the \"global data section\");\n- Stack;\n- Heap;\n\n\n### Compile-time known versus runtime known {#sec-compile-time}\n\nOne strategy that Zig uses to decide where it will store each object that you declare, is by looking\nat the value of this particular object. More specifically, by investigating if this value is\nknown at \"compile-time\" or at \"runtime\".\n\nWhen you write a program in Zig, the values of some of the objects that you write in your program are *known\nat compile time*. Meaning that, when you compile your Zig source code, during the compilation process,\nthe `zig` compiler can figure it out what is the exact value of a particular object\nthat exists in your source code.\nKnowing the length (or the size) of each object is also important. So the length (or the size) of each object that you write in your program is,\nin some cases, *known at compile time*.\n\nThe `zig` compiler cares more about knowing the length (or the size) of a particular object\n, than to know its actual value. But, if the `zig` compiler knows the value of the object, then, it\nautomatically knows the size of this object. Because it can simply calculate the\nsize of the object by looking at the size of the value.\n\nTherefore, the priority for the `zig` compiler is to discover the size of each object in your source code.\nIf the value of the object in question is known at compile-time, then, the `zig` compiler\nautomatically knows the size/length of this object. But if the value of this object is not\nknown at compile-time, then, the size of this object is only known at compile-time if,\nand only if, the type of this object have a known fixed size.\n\nIn order to a type have a known fixed size, this type must have data members whose size is fixed.\nIf this type includes, for example, a variable sized array in it, then, this type do not have a known\nfixed size. Because this array can have any size at runtime\n(i.e. 
it can be an array of 2 elements, or 50 elements, or 1 thousand elements, etc.).\n\nFor example, a string object, which internally is an array of constant u8 values (`[]const u8`)\nhave a variable size. It can be a string object with 100 or 500 characters in it. If we do not\nknow at compile-time, which exact string will be stored inside this string object, then, we cannot calculate\nthe size of this string object at compile-time. So, any type, or any struct declaration that you make, that\nincludes a string data member that do not have an explicit fixed size, makes this type, or this\nnew struct that you are declaring, a type that do not have a known fixed size at compile-time.\n\nIn contrast, if the type or this struct that you are declaring, includes a data member that is an array,\nbut this array have a known fixed size, like `[60]u8` (which declares an array of 60 `u8` values), then,\nthis type, or, this struct that you are declaring, becomes a type with a known fixed size at compile-time.\nAnd because of that, in this case, the `zig` compiler do not need to known at compile-time the exact value of\nany object of this type. Since the compiler can find the necessary size to store this object by\nlooking at the size of its type.\n\n\nLet's look at an example. In the source code below, we have two constant objects (`name` and `array`) declared.\nBecause the values of these particular objects are written down, in the source code itself (`\"Pedro\"`\nand the number sequence from 1 to 4), the `zig` compiler can easily discover the values of these constant\nobjects (`name` and `array`) during the compilation process.\nThis is what \"known at compile time\" means. 
It refers to any object that you have in your Zig source code\nwhose value can be identified at compile time.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = \"Pedro\";\nconst array = [_]u8{1, 2, 3, 4};\n_ = name; _ = array;\n\nfn input_length(input: []const u8) usize {\n const n = input.len;\n return n;\n}\n```\n:::\n\n\n\n\nThe other side of the spectrum are objects whose values are not known at compile time.\nFunction arguments are a classic example of this. Because the value of each function\nargument depends on the value that you assign to this particular argument,\nwhen you call the function.\n\nFor example, the function `input_length()` contains an argument named `input`, which is an array of constant `u8` integers (`[]const u8`).\nIs impossible to know at compile time the value of this particular argument. And it also is impossible to know the size/length\nof this particular argument. Because it is an array that do not have a fixed size specified explicitly in the argument type annotation.\n\nSo, we know that this `input` argument will be an array of `u8` integers. But we do not know at compile-time, its value, and neither his size.\nThis information is known only at runtime, which is the period of time when you program is executed.\nAs a consequence, the value of the expression `input.len` is also known only at runtime.\nThis is an intrinsic characteristic of any function. Just remember that the value of function arguments is usually not \"compile-time known\".\n\nHowever, as I mentioned earlier, what really matters to the compiler is to know the size of the object\nat compile-time, and not necessarily its value. So, although we don't know the value of the object `n`, which is the result of the expression\n`input.len`, at compile-time, we do know its size. 
Because the expression `input.len` always return a value of type `usize`,\nand the type `usize` have a known fixed size.\n\n\n\n### Global data register\n\nThe global data register is a specific section of the executable of your Zig program, that is responsible\nfor storing any value that is known at compile time.\n\nEvery constant object whose value is known at compile time that you declare in your source code,\nis stored in the global data register. Also, every literal value that you write in your source code,\nsuch as the string `\"this is a string\"`, or the integer `10`, or a boolean value such as `true`,\nis also stored in the global data register.\n\nHonestly, you don't need to care much about this memory space. Because you can't control it,\nyou can't deliberately access it or use it for your own purposes.\nAlso, this memory space does not affect the logic of your program.\nIt simply exists in your program.\n\n\n### Stack vs Heap\n\nIf you are familiar with system's programming, or just low-level programming in general, you\nprobably have heard of the \"duel\" between Stack vs Heap. These are two different types of memory,\nor different memory spaces, which are both available in Zig.\n\nThese two types of memory don't actually duel with\neach other. This is a common mistake that beginners have, when seeing \"x vs y\" styles of\ntabloid headlines. These two types of memory are actually complementary to each other.\nSo, in almost every Zig program that you ever write, you will likely use a combination of both.\nI will describe each memory space in detail over the next sections. But for now, I just want to\nstablish the main difference between these two types of memory.\n\nIn essence, the stack memory is normally used to store values whose length is fixed and known\nat compile time. 
In contrast, the heap memory is a *dynamic* type of memory space, meaning that, it is\nused to store values whose length might grow during the execution (runtime) of your program [@jenny2022].\n\nLengths that grow during runtime are intrinsically associated with \"runtime known\" type of values.\nIn other words, if you have an object whose length might grow during runtime, then, the length\nof this object becomes not known at compile time. If the length is not known at compile-time,\nthe value of this object also becomes not known at compile-time.\nThese types of objects should be stored in the heap memory space, which is\na dynamic memory space, which can grow or shrink to fit the size of your objects.\n\n\n\n### Stack {#sec-stack}\n\nThe stack is a type of memory that uses the power of the *stack data structure*, hence the name. \nA \"stack\" is a type of *data structure* that uses a \"last in, first out\" (LIFO) mechanism to store the values\nyou give it to. I imagine you are familiar with this data structure.\nBut, if you are not, the [Wikipedia page](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))[^wiki-stack]\n, or, the [Geeks For Geeks page](https://www.geeksforgeeks.org/stack-data-structure/)[^geek-stack] are both\nexcellent and easy resources to fully understand how this data structure works.\n\n[^wiki-stack]: \n[^geek-stack]: \n\nSo, the stack memory space is a type of memory that stores values using a stack data structure.\nIt adds and removes values from the memory by following a \"last in, first out\" (LIFO) principle.\n\nEvery time you make a function call in Zig, an amount of space in the stack is\nreserved for this particular function call [@jenny2022; @zigdocs].\nThe value of each function argument given to the function in this function call is stored in this\nstack space. 
Also, every local object that you declare inside the function scope is\nusually stored in this same stack space.\n\n\nLooking at the example below, the object `result` is a local object declared inside the scope of the `add()`\nfunction. Because of that, this object is stored inside the stack space reserved for the `add()` function.\nThe `r` object (which is declared outside of the `add()` function scope) is also stored in the stack.\nBut since it is declared in the \"outer\" scope, this object is stored in the\nstack space that belongs to this outer scope.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) u8 {\n const result = x + y;\n return result;\n}\n```\n:::\n\n\n\n\n\nSo, any object that you declare inside the scope of a function is always stored inside\nthe space that was reserved for that particular function in the stack memory. This\nalso counts for any object declared inside the scope of your `main()` function for example.\nAs you would expect, in this case, they\nare stored inside the stack space reserved for the `main()` function.\n\nOne very important detail about the stack memory is that **it frees itself automatically**.\nThis is very important, remember that. 
When objects are stored in the stack memory,\nyou don't have the work (or the responsibility) of freeing/destroying these objects.\nBecause they will be automatically destroyed once the stack space is freed at the end of the function scope.\n\nSo, once the function call returns (or ends, if you prefer to call it this way)\nthe space that was reserved in the stack is destroyed, and all of the objects that were in that space goes away with it.\nThis mechanism exists because this space, and the objects within it, are not necessary anymore,\nsince the function \"finished its business\".\nUsing the `add()` function that we exposed above as an example, it means that the object `result` is automatically\ndestroyed once the function returns.\n\n::: {.callout-important}\nLocal objects that are stored in the stack space of a function are automatically\nfreed/destroyed at the end of the function scope.\n:::\n\n\nThis same logic applies to any other special structure in Zig that have its own scope by surrounding\nit with curly braces (`{}`).\nFor loops, while loops, if else statements, etc. For example, if you declare any local\nobject in the scope of a for loop, this local object is accessible only within the scope\nof this particular for loop. Because once the scope of this for loop ends, the space in the stack\nreserved for this for loop is freed.\nThe example below demonstrates this idea.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This does not compile successfully!\nconst a = [_]u8{0, 1, 2, 3, 4};\nfor (0..a.len) |i| {\n const index = i;\n _ = index;\n}\n// Trying to use an object that was\n// declared in the for loop scope,\n// and that does not exist anymore.\nstd.debug.print(\"{d}\\n\", index);\n```\n:::\n\n\n\n\n\n\nOne important consequence of this mechanism is that, once the function returns, you can no longer access any memory\naddress that was inside the space in the stack reserved for this particular function. Because this space was\ndestroyed. 
This means that, if this local object is stored in the stack,\nyou cannot make a function that **returns a pointer to this object**.\n\nThink about that for a second. If all local objects in the stack are destroyed at the end of the function scope, why\nwould you even consider returning a pointer to one of these objects? This pointer is at best,\ninvalid, or, more likely, \"undefined\".\n\nIn conclusion, it is totally fine to write a function that returns the local object\nitself as result, because then, you return the value of that object as the result.\nBut, if this local object is stored in the stack, you should never write a function\nthat returns a pointer to this local object. Because the memory address pointed to by the pointer\nis no longer valid.\n\n\nSo, using again the `add()` function as an example, if you rewrite this function so that it\nreturns a pointer to the local object `result`, the `zig` compiler will actually compile\nyour program, with no warnings or errors. At first glance, it looks like this is good code\nthat works as expected. But this is a lie!\n\nIf you try to take a look at the value inside of the `r` object,\nor, if you try to use this `r` object in another expression\nor function call, then, you would have undefined behaviour, and major\nbugs in your program [@zigdocs, see \"Lifetime and Ownership\"[^life] and \"Undefined Behaviour\"[^undef] sections].\n\n[^life]: \n[^undef]: \n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This code compiles successfully. But it has\n// undefined behaviour. Never do this!!!\n\n// The `r` object is undefined!\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) *const u8 {\n    const result = x + y;\n    return &result;\n}\n```\n:::\n\n\n\n\nThis \"invalid pointer to stack variable\" problem is well known across many programming language communities.\nIf you try to do the same thing, for example, in a C or C++ program (i.e. 
returning an address to\na local object stored in the stack), you would also get undefined behaviour\nin the program.\n\n::: {.callout-important}\nIf a local object in your function is stored in the stack, you should never\nreturn a pointer to this local object from the function. Because\nthis pointer will always become undefined after the function returns, since the stack space of the function\nis destroyed at the end of its scope.\n:::\n\nBut what if you really need to use this local object in some way after your function returns?\nHow can you do this? The answer is: \"in the same way you would do it if this was a C or C++ program. By returning\nan address to an object stored in the heap\". The heap memory has a much more flexible lifecycle,\nand allows you to get a valid pointer to a local object of a function that already returned\nfrom its scope.\n\n\n### Heap {#sec-heap}\n\nOne important limitation of the stack is that only objects whose length/size is known at compile-time can be\nstored in it. In contrast, the heap is a much more dynamic\n(and flexible) type of memory. It is the perfect type of memory to use\nfor objects whose size/length might grow during the execution of your program.\n\nVirtually any application that behaves as a server is a classic use case of the heap.\nAn HTTP server, an SSH server, a DNS server, an LSP server, ... any type of server.\nIn summary, a server is a type of application that runs for long periods of time,\nand that serves (or \"deals with\") any incoming request that reaches this particular server.\n\nThe heap is a good choice for this type of system, mainly because the server does not know upfront\nhow many requests it will receive from users, while it is active. 
It could be one single request,\nor, 5 thousand requests, or, it could also be zero requests.\nThe server needs to have the ability to allocate and manage its memory according to how many requests it receives.\n\nAnother key difference between the stack and the heap, is that the heap is a type\nof memory that you, the programmer, have complete control over. This makes the heap a\nmore flexible type of memory, but it also makes it harder to work with. Because you,\nthe programmer, are responsible for managing everything related to it. Including where the memory is allocated,\nhow much memory is allocated, and where this memory is freed.\n\n> Unlike stack memory, heap memory is allocated explicitly by programmers and it won’t be deallocated until it is explicitly freed [@jenny2022].\n\nTo store an object in the heap, you, the programmer, need to explicitly tell Zig to do so,\nby using an allocator to allocate some space in the heap. At @sec-allocators, I will present how you can use allocators to allocate memory\nin Zig.\n\n::: {.callout-important}\nEvery piece of memory you allocate in the heap needs to be explicitly freed by you, the programmer.\n:::\n\nThe majority of allocators in Zig do allocate memory on the heap. But some exceptions to this rule are\n`ArenaAllocator()` and `FixedBufferAllocator()`. The `ArenaAllocator()` is a special\ntype of allocator that works in conjunction with a second type of allocator.\nOn the other hand, the `FixedBufferAllocator()` is an allocator that works based on\nbuffer objects created on the stack. This means that the `FixedBufferAllocator()` makes\nallocations only on the stack.\n\n\n\n\n### Summary\n\nAfter discussing all of these boring details, we can quickly recap what we learned.\nIn summary, the Zig compiler will use the following rules to decide where each\nobject you declare is stored:\n\n1. every literal value (such as `\"this is a string\"`, `10`, or `true`) is stored in the global data section.\n1. 
every constant object (`const`) whose value **is known at compile-time** is also stored in the global data section.\n1. every object (constant or not) whose length/size **is known at compile time** is stored in the stack space for the current scope.\n1. if an object is created with the method `alloc()` or `create()` of an allocator object, this object is stored in the memory space used by this particular allocator object. Most of the allocators available in Zig use the heap memory, so, this object is likely stored in the heap (`FixedBufferAllocator()` is an exception to that).\n1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then, it is most certainly not an object stored in the heap.\n\n\n## Allocators {#sec-allocators}\n\nOne key aspect about Zig, is that there are \"no hidden-memory allocations\" in Zig.\nWhat that really means, is that \"no allocations happen behind your back in the standard library\" [@zigguide].\n\nThis is a known problem, especially in C++. Because in C++, there are some operators that do allocate\nmemory behind the scenes, and there is no way for you to know that, until you actually read the\nsource code of these operators, and find the memory allocation calls.\nMany programmers find this behaviour annoying and hard to keep track of.\n\nBut, in Zig, if a function, an operator, or anything from the standard library\nneeds to allocate some memory during its execution, then, this function/operator needs to receive (as input) an allocator\nprovided by the user, to actually be able to allocate the memory it needs.\n\nThis creates a clear distinction between functions that \"do not\" from those that \"actually do\"\nallocate memory. 
Just look at the arguments of this function.\nIf a function, or operator, has an allocator object as one of its inputs/arguments, then, you know for\nsure that this function/operator will allocate some memory during its execution.\n\nAn example is the `allocPrint()` function from the Zig standard library. With this function, you can\nwrite a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C.\nIn order to write such a new string, the `allocPrint()` function needs to allocate some memory to store the\noutput string.\n\nThat is why, the first argument of this function is an allocator object that you, the user/programmer, give\nas input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator\nobject. But I could easily use any other type of allocator object from the Zig standard library.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n    allocator,\n    \"Hello {s}!!!\",\n    .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello Pedro!!!\n```\n\n\n:::\n:::\n\n\n\n\n\nYou get a lot of control\nover where and how much memory this function can allocate. 
Because it is you,\nthe user/programmer, that provides the allocator for the function to use.\nThis makes \"total control\" over memory management easier to achieve in Zig.\n\n### What are allocators?\n\nAllocators in Zig are objects that you can use to allocate memory for your program.\nThey are similar to the memory allocating functions in C, like `malloc()` and `calloc()`.\nSo, if you need to use more memory than you initially have, during the execution of your program, you can simply ask\nfor more memory using an allocator.\n\nZig offers different types of allocators, and they are usually available through the `std.heap` module of\nthe standard library. So, just import the Zig standard library into your Zig module (with `@import(\"std\")`), and you can start\nusing these allocators in your code.\n\nFurthermore, every allocator object is built on top of the `Allocator` interface in Zig. This\nmeans that, every allocator object you find in Zig must have the methods `alloc()`,\n`create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using,\nbut you don't need to change the function calls to the methods that do the memory allocation\n(and the free memory operations) for your program.\n\n### Why you need an allocator?\n\nAs we described at @sec-stack, every time you make a function call in Zig,\na space in the stack is reserved for this function call. But the stack\nhas a key limitation which is: every object stored in the stack has a\nknown fixed length.\n\nBut in reality, there are two very common instances where this \"fixed length limitation\" of the stack is a deal breaker:\n\n1. the objects that you create inside your function might grow in size during the execution of the function.\n1. 
sometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer\nto a local object. As I described at @sec-stack, you cannot do that if this local object is stored in the\nstack. However, if this object is stored in the heap, then, you can return a pointer to this object at the\nend of the function. Because you (the programmer) control the lifetime of any heap memory that you allocate. You decide\nwhen this memory gets destroyed/freed.\n\nThese are common situations that the stack is not good for.\nThat is why you need a different memory management strategy to\nstore these objects inside your function. You need to use\na memory type that can grow together with your objects, or whose\nlifetime you can control.\nThe heap fits this description.\n\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size\nduring the execution of your program, you grow the amount of memory\nyou have by allocating more memory in the heap to store these objects. \nAnd you do that in Zig, by using an allocator object.\n\n\n### The different types of allocators\n\n\nAt the time of writing this book, in Zig, we have 6 different\nallocators available in the standard library:\n\n- `GeneralPurposeAllocator()`.\n- `page_allocator()`.\n- `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()`.\n- `ArenaAllocator()`.\n- `c_allocator()` (requires you to link to libc).\n\n\nEach allocator has its own perks and limitations. All allocators, except `FixedBufferAllocator()` and `ArenaAllocator()`,\nare allocators that use the heap memory. 
So any memory that you allocate with\nthese allocators, will be placed in the heap.\n\n### General-purpose allocators\n\nThe `GeneralPurposeAllocator()`, as the name suggests, is a \"general purpose\" allocator. You can use it for every type\nof task. In the example below, I'm allocating enough space to store a single integer in the object `some_number`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n    var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n    const allocator = gpa.allocator();\n    const some_number = try allocator.create(u32);\n    defer allocator.destroy(some_number);\n\n    some_number.* = @as(u32, 45);\n}\n```\n:::\n\n\n\n\n\nWhile the `GeneralPurposeAllocator()` is useful, you might prefer to use the `c_allocator()`, which is an alias to the C standard allocator `malloc()`. So, yes, you can use\n`malloc()` in Zig if you want to. Just use the `c_allocator()` from the Zig standard library. However,\nif you do use `c_allocator()`, you must link to Libc when compiling your source code with the\n`zig` compiler, by including the flag `-lc` in your compilation process.\nIf you do not link your source code to Libc, Zig will not be able to find the\n`malloc()` implementation in your system.\n\n### Page allocator\n\nThe `page_allocator()` is an allocator that allocates full pages of memory in the heap. In other words,\nevery time you allocate memory with `page_allocator()`, a full page of memory in the heap is allocated,\ninstead of just a small piece of it.\n\nThe size of this page depends on the system you are using.\nMost systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally\nallocated in each call by `page_allocator()`. That is why, `page_allocator()` is considered a\nfast, but also \"wasteful\" allocator in Zig. 
Because it allocates a large amount of memory\nin each call, and you most likely will not need that much memory in your program.\n\n### Buffer allocators\n\nThe `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that\nwork with a fixed-size buffer that is stored in the stack. So these two allocators only allocate\nmemory in the stack. This also means that, in order to use these allocators, you must first\ncreate a buffer object, and then, give this buffer as an input to these allocators.\n\nIn the example below, I am creating a `buffer` object that is 10 elements long.\nNotice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.\nNow, because this `buffer` object is 10 elements long, this means that I am limited to this space.\nI cannot allocate more than 10 elements with this allocator object. If I try to\nallocate more than that, the `alloc()` method will return an `OutOfMemory` error value.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n    buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n```\n:::\n\n\n\n\n\n### Arena allocator {#sec-arena-allocator}\n\nThe `ArenaAllocator()` is an allocator object that takes a child allocator as input. The idea behind the `ArenaAllocator()` in Zig\nis similar to the concept of \"arenas\" in the programming language Go[^go-arena]. 
It is an allocator object that allows you\nto allocate memory as many times as you want, but free all memory only once.\nIn other words, if you have, for example, called the method `alloc()` of an `ArenaAllocator()` object 5 times, you can\nfree all the memory you allocated over these 5 calls at once, by simply calling the `deinit()` method of the same `ArenaAllocator()` object.\n\n[^go-arena]: \n\nIf you give, for example, a `GeneralPurposeAllocator()` object as input to the `ArenaAllocator()` constructor, like in the example below, then, the allocations\nyou perform with `alloc()` will actually be made with the underlying object `GeneralPurposeAllocator()` that was passed.\nSo, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator\nreally does is help you to free all the memory you allocated multiple times with just a single command. In the example\nbelow, I called `alloc()` 3 times. So, if I did not use an arena allocator, then, I would need to call\n`free()` 3 times to free all the allocated memory.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = try allocator.alloc(u8, 5);\nconst in2 = try allocator.alloc(u8, 10);\nconst in3 = try allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n```\n:::\n\n\n\n\n\n\n### The `alloc()` and `free()` methods\n\nIn the code example below, we are accessing the `stdin`, which is\nthe standard input channel, to receive an input from the\nuser. We read the input given by the user with the `readUntilDelimiterOrEof()`\nmethod.\n\nNow, after reading the input of the user, we need to store this input somewhere in\nour program. That is why I use an allocator in this example. I use it to allocate some\namount of memory to store this input given by the user. 
More specifically, the method `alloc()`\nof the allocator object is used to allocate an array capable of storing 50 `u8` values.\n\nNotice that this `alloc()` method receives two inputs. The first one, is a type.\nThis defines what type of values the allocated array will store. In the example\nbelow, we are allocating an array of unsigned 8-bit integers (`u8`). But\nyou can create an array to store any type of value you want. Next, on the second argument, we\ndefine the size of the allocated array, by specifying how many elements\nthis array will contain. In the case below, we are allocating an array of 50 elements.\n\nAt @sec-zig-strings we described that strings in Zig are simply arrays of characters.\nEach character is represented by a `u8` value. So, this means that the array that\nwas allocated in the object `input` is capable of storing a string that is\n50 characters long.\n\nSo, in essence, the expression `var input: [50]u8 = undefined` would create\nan array for 50 `u8` values in the stack of the current scope. 
But, you\ncan allocate the same array in the heap by using the expression `var input = try allocator.alloc(u8, 50)`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n```\n:::\n\n\n\n\nAlso, notice that in this example, we use the `defer` keyword (which I described at @sec-defer) to run a small\npiece of code at the end of the current scope, which is the expression `allocator.free(input)`.\nWhen you execute this expression, the allocator will free the memory that it allocated\nfor the `input` object.\n\nWe have talked about this at @sec-heap. You **should always** explicitly free any memory that you allocate\nusing an allocator! You do that by using the `free()` method of the same allocator object you\nused to allocate this memory. The `defer` keyword is used in this example only to help us execute\nthis free operation at the end of the current scope.\n\n\n### The `create()` and `destroy()` methods\n\nWith the `alloc()` and `free()` methods, you can allocate memory to store multiple elements\nat once. In other words, with these methods, we always allocate an array to store multiple elements at once.\nBut what if you need enough space to store just a single item? Should you\nallocate an array of a single element through `alloc()`?\n\nThe answer is no! 
In this case,\nyou should use the `create()` method of the allocator object.\nEvery allocator object offers the `create()` and `destroy()` methods,\nwhich are used to allocate and free memory for a single item, respectively.\n\nSo, in essence, if you want to allocate memory to store an array of elements, you\nshould use `alloc()` and `free()`. But if you need to store just a single item,\nthen, the `create()` and `destroy()` methods are ideal for you.\n\nIn the example below, I'm defining a struct to represent a user of some sort.\nIt could be a user for a game, or a software to manage resources, it doesn't matter.\nNotice that I use the `create()` method this time, to store a single `User` object\nin the program. Also notice that I use the `destroy()` method to free the memory\nused by this object at the end of the scope.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst User = struct {\n    id: usize,\n    name: []const u8,\n\n    pub fn init(id: usize, name: []const u8) User {\n        return .{ .id = id, .name = name };\n    }\n};\n\npub fn main() !void {\n    var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n    const allocator = gpa.allocator();\n    const user = try allocator.create(User);\n    defer allocator.destroy(user);\n\n    user.* = User.init(0, \"Pedro\");\n}\n```\n:::\n", - "supporting": [], + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Memory and Allocators\n\n\nIn this chapter, we will talk about memory. How does Zig control memory? What\ncommon tools are used? Is there any important aspect that makes memory\ndifferent/special in Zig? You will find the answers here.\n\nComputers fundamentally rely on memory to function. This memory acts as a temporary storage\nspace for the data and values generated during computations. 
Without memory, the core\nconcepts of \"variables\" and \"objects\" in programming languages would be impossible.\n\n\n\n\n## Memory spaces\n\nEvery object that you create in your Zig source code needs to be stored somewhere,\nin your computer's memory. Depending on where and how you define your object, Zig\nwill use a different \"memory space\", or a different\ntype of memory to store this object.\n\nEach type of memory normally serves a different purpose.\nIn Zig, there are 3 types of memory (or 3 different memory spaces) that we care about. They are:\n\n- Global data register (or the \"global data section\");\n- Stack;\n- Heap;\n\n\n### Compile-time known versus runtime known {#sec-compile-time}\n\nOne strategy that Zig uses to decide where it will store each object that you declare, is by looking\nat the value of this particular object. More specifically, by investigating if this value is\nknown at \"compile-time\" or at \"runtime\".\n\nWhen you write a program in Zig, the values of some of the objects that you write in your program are *known\nat compile time*. Meaning that, when you compile your Zig source code, during the compilation process,\nthe `zig` compiler can figure out the exact value of a particular object\nthat exists in your source code.\nKnowing the length (or the size) of each object is also important. So the length (or the size) of each object that you write in your program is,\nin some cases, *known at compile time*.\n\nThe `zig` compiler cares more about knowing the length (or the size) of a particular object\nthan about knowing its actual value. But, if the `zig` compiler knows the value of the object, then, it\nautomatically knows the size of this object. 
Because it can simply calculate the\nsize of the object by looking at the size of the value.\n\nTherefore, the priority for the `zig` compiler is to discover the size of each object in your source code.\nIf the value of the object in question is known at compile-time, then, the `zig` compiler\nautomatically knows the size/length of this object. But if the value of this object is not\nknown at compile-time, then, the size of this object is only known at compile-time if,\nand only if, the type of this object has a known fixed size.\n\nIn order for a type to have a known fixed size, this type must have data members whose size is fixed.\nIf this type includes, for example, a variable sized array in it, then, this type does not have a known\nfixed size. Because this array can have any size at runtime\n(i.e. it can be an array of 2 elements, or 50 elements, or 1 thousand elements, etc.).\n\nFor example, a string object, which internally is an array of constant u8 values (`[]const u8`)\nhas a variable size. It can be a string object with 100 or 500 characters in it. If we do not\nknow at compile-time, which exact string will be stored inside this string object, then, we cannot calculate\nthe size of this string object at compile-time. So, any type, or any struct declaration that you make, that\nincludes a string data member that does not have an explicit fixed size, makes this type, or this\nnew struct that you are declaring, a type that does not have a known fixed size at compile-time.\n\nIn contrast, if the type or this struct that you are declaring, includes a data member that is an array,\nbut this array has a known fixed size, like `[60]u8` (which declares an array of 60 `u8` values), then,\nthis type, or, this struct that you are declaring, becomes a type with a known fixed size at compile-time.\nAnd because of that, in this case, the `zig` compiler does not need to know at compile-time the exact value of\nany object of this type. 
Since the compiler can find the necessary size to store this object by\nlooking at the size of its type.\n\n\nLet's look at an example. In the source code below, we have two constant objects (`name` and `array`) declared.\nBecause the values of these particular objects are written down, in the source code itself (`\"Pedro\"`\nand the number sequence from 1 to 4), the `zig` compiler can easily discover the values of these constant\nobjects (`name` and `array`) during the compilation process.\nThis is what \"known at compile time\" means. It refers to any object that you have in your Zig source code\nwhose value can be identified at compile time.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = \"Pedro\";\nconst array = [_]u8{1, 2, 3, 4};\n_ = name; _ = array;\n\nfn input_length(input: []const u8) usize {\n    const n = input.len;\n    return n;\n}\n```\n:::\n\n\n\n\nOn the other side of the spectrum are objects whose values are not known at compile time.\nFunction arguments are a classic example of this. Because the value of each function\nargument depends on the value that you assign to this particular argument,\nwhen you call the function.\n\nFor example, the function `input_length()` contains an argument named `input`, which is an array of constant `u8` integers (`[]const u8`).\nIt is impossible to know at compile time the value of this particular argument. It is also impossible to know the size/length\nof this particular argument. Because it is an array that does not have a fixed size specified explicitly in the argument type annotation.\n\nSo, we know that this `input` argument will be an array of `u8` integers. But we do not know, at compile-time, either its value or its size.\nThis information is known only at runtime, which is the period of time when your program is executed.\nAs a consequence, the value of the expression `input.len` is also known only at runtime.\nThis is an intrinsic characteristic of any function. 
Just remember that the value of function arguments is usually not \"compile-time known\".\n\nHowever, as I mentioned earlier, what really matters to the compiler is to know the size of the object\nat compile-time, and not necessarily its value. So, although we don't know the value of the object `n`, which is the result of the expression\n`input.len`, at compile-time, we do know its size. Because the expression `input.len` always returns a value of type `usize`,\nand the type `usize` has a known fixed size.\n\n\n\n### Global data register\n\nThe global data register is a specific section of the executable of your Zig program, that is responsible\nfor storing any value that is known at compile time.\n\nEvery constant object whose value is known at compile time that you declare in your source code,\nis stored in the global data register. Also, every literal value that you write in your source code,\nsuch as the string `\"this is a string\"`, or the integer `10`, or a boolean value such as `true`,\nis also stored in the global data register.\n\nHonestly, you don't need to care much about this memory space. Because you can't control it,\nyou can't deliberately access it or use it for your own purposes.\nAlso, this memory space does not affect the logic of your program.\nIt simply exists in your program.\n\n\n### Stack vs Heap\n\nIf you are familiar with systems programming, or just low-level programming in general, you\nprobably have heard of the \"duel\" between Stack vs Heap. These are two different types of memory,\nor different memory spaces, which are both available in Zig.\n\nThese two types of memory don't actually duel with\neach other. This is a common mistake that beginners make, when seeing \"x vs y\" styles of\ntabloid headlines. 
These two types of memory are actually complementary to each other.\nSo, in almost every Zig program that you ever write, you will likely use a combination of both.\nI will describe each memory space in detail over the next sections. But for now, I just want to\nestablish the main difference between these two types of memory.\n\nIn essence, the stack memory is normally used to store values whose length is fixed and known\nat compile time. In contrast, the heap memory is a *dynamic* type of memory space, meaning that, it is\nused to store values whose length might grow during the execution (runtime) of your program [@jenny2022].\n\nLengths that grow during runtime are intrinsically associated with \"runtime known\" type of values.\nIn other words, if you have an object whose length might grow during runtime, then, the length\nof this object becomes not known at compile time. If the length is not known at compile-time,\nthe value of this object also becomes not known at compile-time.\nThese types of objects should be stored in the heap memory space, which is\na dynamic memory space, which can grow or shrink to fit the size of your objects.\n\n\n\n### Stack {#sec-stack}\n\nThe stack is a type of memory that uses the power of the *stack data structure*, hence the name. \nA \"stack\" is a type of *data structure* that uses a \"last in, first out\" (LIFO) mechanism to store the values\nyou give to it. 
I imagine you are familiar with this data structure.\nBut, if you are not, the [Wikipedia page](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))[^wiki-stack]\n, or, the [Geeks For Geeks page](https://www.geeksforgeeks.org/stack-data-structure/)[^geek-stack] are both\nexcellent and easy resources to fully understand how this data structure works.\n\n[^wiki-stack]: \n[^geek-stack]: \n\nSo, the stack memory space is a type of memory that stores values using a stack data structure.\nIt adds and removes values from the memory by following a \"last in, first out\" (LIFO) principle.\n\nEvery time you make a function call in Zig, an amount of space in the stack is\nreserved for this particular function call [@jenny2022; @zigdocs].\nThe value of each function argument given to the function in this function call is stored in this\nstack space. Also, every local object that you declare inside the function scope is\nusually stored in this same stack space.\n\n\nLooking at the example below, the object `result` is a local object declared inside the scope of the `add()`\nfunction. Because of that, this object is stored inside the stack space reserved for the `add()` function.\nThe `r` object (which is declared outside of the `add()` function scope) is also stored in the stack.\nBut since it is declared in the \"outer\" scope, this object is stored in the\nstack space that belongs to this outer scope.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) u8 {\n const result = x + y;\n return result;\n}\n```\n:::\n\n\n\n\n\nSo, any object that you declare inside the scope of a function is always stored inside\nthe space that was reserved for that particular function in the stack memory. 
This\nalso applies to any object declared inside the scope of your `main()` function for example.\nAs you would expect, in this case, they\nare stored inside the stack space reserved for the `main()` function.\n\nOne very important detail about the stack memory is that **it frees itself automatically**.\nThis is very important, remember that. When objects are stored in the stack memory,\nyou don't have the work (or the responsibility) of freeing/destroying these objects,\nbecause they are automatically destroyed once the stack space is freed at the end of the function scope.\n\nSo, once the function call returns (or ends, if you prefer to call it this way)\nthe space that was reserved in the stack is destroyed, and all of the objects that were in that space go away with it.\nThis mechanism exists because this space, and the objects within it, are not necessary anymore,\nsince the function \"finished its business\".\nUsing the `add()` function that we presented above as an example, it means that the object `result` is automatically\ndestroyed once the function returns.\n\n::: {.callout-important}\nLocal objects that are stored in the stack space of a function are automatically\nfreed/destroyed at the end of the function scope.\n:::\n\n\nThis same logic applies to any other special structure in Zig that has its own scope delimited\nby curly braces (`{}`).\nFor loops, while loops, if else statements, etc. For example, if you declare any local\nobject in the scope of a for loop, this local object is accessible only within the scope\nof this particular for loop. 
Because once the scope of this for loop ends, the space in the stack\nreserved for this for loop is freed.\nThe example below demonstrates this idea.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This does not compile successfully!\nconst a = [_]u8{0, 1, 2, 3, 4};\nfor (0..a.len) |i| {\n const index = i;\n _ = index;\n}\n// Trying to use an object that was\n// declared in the for loop scope,\n// and that does not exist anymore.\nstd.debug.print(\"{d}\\n\", .{index});\n```\n:::\n\n\n\n\n\n\nOne important consequence of this mechanism is that, once the function returns, you can no longer access any memory\naddress that was inside the space in the stack reserved for this particular function, because this space was\ndestroyed. This means that, if this local object is stored in the stack,\nyou cannot make a function that **returns a pointer to this object**.\n\nThink about that for a second. If all local objects in the stack are destroyed at the end of the function scope, why\nwould you even consider returning a pointer to one of these objects? This pointer is at best,\ninvalid, or, more likely, \"undefined\".\n\nIn conclusion, it is totally fine to write a function that returns the local object\nitself as result, because then, you return the value of that object as the result.\nBut, if this local object is stored in the stack, you should never write a function\nthat returns a pointer to this local object, because the memory address pointed to by the pointer\nis no longer valid.\n\n\nSo, using again the `add()` function as an example, if you rewrite this function so that it\nreturns a pointer to the local object `result`, the `zig` compiler will actually compile\nyour program, with no warnings or errors. At first glance, it looks like this is good code\nthat works as expected. 
But this is a lie!\n\nIf you try to take a look at the value inside of the `r` object,\nor, if you try to use this `r` object in another expression\nor function call, then, you would have undefined behaviour, and major\nbugs in your program [@zigdocs, see \"Lifetime and Ownership\"[^life] and \"Undefined Behaviour\"[^undef] sections].\n\n[^life]: \n[^undef]: \n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This code compiles successfully. But it has\n// undefined behaviour. Never do this!!!\n\n// The `r` object is undefined!\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) *const u8 {\n const result = x + y;\n return &result;\n}\n```\n:::\n\n\n\n\nThis \"invalid pointer to stack variable\" problem is well known across many programming language communities.\nIf you try to do the same thing, for example, in a C or C++ program (i.e. returning the address of\na local object stored in the stack), you would also get undefined behaviour\nin the program.\n\n::: {.callout-important}\nIf a local object in your function is stored in the stack, you should never\nreturn a pointer to this local object from the function, because\nthis pointer always becomes invalid after the function returns, since the stack space of the function\nis destroyed at the end of its scope.\n:::\n\nBut what if you really need to use this local object in some way after your function returns?\nHow can you do this? The answer is: \"in the same way you would do it if this was a C or C++ program. By returning\nan address to an object stored in the heap\". The heap memory has a much more flexible lifecycle,\nand allows you to get a valid pointer to a local object of a function that already returned\nfrom its scope.\n\n\n### Heap {#sec-heap}\n\nOne important limitation of the stack is that only objects whose length/size is known at compile-time can be\nstored in it. In contrast, the heap is a much more dynamic\n(and flexible) type of memory. 
It is the perfect type of memory to use\nfor objects whose size/length might grow during the execution of your program.\n\nVirtually any application that behaves as a server is a classic use case of the heap.\nAn HTTP server, an SSH server, a DNS server, an LSP server, ... any type of server.\nIn summary, a server is a type of application that runs for long periods of time,\nand that serves (or \"deals with\") any incoming request that reaches this particular server.\n\nThe heap is a good choice for this type of system, mainly because the server does not know upfront\nhow many requests it will receive from users, while it is active. It could be one single request,\nor, 5 thousand requests, or, it could also be zero requests.\nThe server needs to have the ability to allocate and manage its memory according to how many requests it receives.\n\nAnother key difference between the stack and the heap, is that the heap is a type\nof memory that you, the programmer, have complete control over. This makes the heap a\nmore flexible type of memory, but it also makes it harder to work with, because you,\nthe programmer, are responsible for managing everything related to it, including where the memory is allocated,\nhow much memory is allocated, and where this memory is freed.\n\n> Unlike stack memory, heap memory is allocated explicitly by programmers and it won’t be deallocated until it is explicitly freed [@jenny2022].\n\nTo store an object in the heap, you, the programmer, need to explicitly tell Zig to do so,\nby using an allocator to allocate some space in the heap. At @sec-allocators, I will present how you can use allocators to allocate memory\nin Zig.\n\n::: {.callout-important}\nAll memory you allocate in the heap needs to be explicitly freed by you, the programmer.\n:::\n\nThe majority of allocators in Zig do allocate memory on the heap. But some exceptions to this rule are\n`ArenaAllocator()` and `FixedBufferAllocator()`. 
The `ArenaAllocator()` is a special\ntype of allocator that works in conjunction with a second type of allocator.\nOn the other hand, the `FixedBufferAllocator()` is an allocator that works based on\na buffer object that you provide. If this buffer object is created on the stack, then the `FixedBufferAllocator()` makes\nallocations only on the stack.\n\n\n\n\n### Summary\n\nAfter discussing all of these boring details, we can quickly recap what we learned.\nIn summary, the Zig compiler will use the following rules to decide where each\nobject you declare is stored:\n\n1. every literal value (such as `\"this is string\"`, `10`, or `true`) is stored in the global data section.\n1. every constant object (`const`) whose value **is known at compile-time** is also stored in the global data section.\n1. every object (constant or not) whose length/size **is known at compile time** is stored in the stack space for the current scope.\n1. if an object is created with the method `alloc()` or `create()` of an allocator object, this object is stored in the memory space used by this particular allocator object. Most of the allocators available in Zig use the heap memory, so, this object is likely stored in the heap (`FixedBufferAllocator()` is an exception to that).\n1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then, it is most certainly not an object stored in the heap.\n\n\n## Stack overflows {#sec-stack-overflow}\n\nAllocating memory on the stack is generally faster than allocating it on the heap.\nBut this better performance comes with many restrictions. We have already discussed\nmany of these restrictions of the stack at @sec-stack. But there is one more important\nlimitation that I want to talk about, which is the size of the stack itself.\n\nThe stack is limited in size. 
This size varies from computer to computer, and it depends on\na lot of things (the computer architecture, the operating system, etc.). Nevertheless, this size is usually\nnot that big. This is why we normally use the stack to store only temporary and small objects in memory.\n\nIn essence, if you try to make an allocation on the stack that is so big that it exceeds the stack size limit,\na *stack overflow* happens, and your program just crashes as a result of that. In other words, a stack overflow happens when\nyou attempt to use more space than is available on the stack.\n\nThis type of problem is very similar to a *buffer overflow*, i.e. you are trying to use more space\nthan is available in the \"buffer object\". However, a stack overflow always causes your program to crash,\nwhile a buffer overflow does not always cause your program to crash (although it often does).\n\nYou can see an example of a stack overflow in the example below. We are trying to allocate a very big array of `u64` values\n(24 million elements, about 192 MB) on the stack. You can see below that this program does not run successfully, because it crashes\nwith a \"segmentation fault\" error message.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar very_big_alloc: [1000 * 1000 * 24]u64 = undefined;\n@memset(very_big_alloc[0..], 0);\n```\n:::\n\n\n\n\n```\nSegmentation fault (core dumped)\n```\n\nThis segmentation fault error is a result of the stack overflow that was caused by the big\nmemory allocation made on the stack, to store the `very_big_alloc` object.\nThis is why very big objects are usually stored on the heap, instead of the stack.\n\n\n\n## Allocators {#sec-allocators}\n\nOne key aspect about Zig, is that there are \"no hidden-memory allocations\" in Zig.\nWhat that really means, is that \"no allocations happen behind your back in the standard library\" [@zigguide].\n\nThis is a known problem, especially in C++. 
Because in C++, there are some operators that do allocate\nmemory behind the scenes, and there is no way for you to know that, until you actually read the\nsource code of these operators, and find the memory allocation calls.\nMany programmers find this behaviour annoying and hard to keep track of.\n\nBut, in Zig, if a function, an operator, or anything from the standard library\nneeds to allocate some memory during its execution, then, this function/operator needs to receive (as input) an allocator\nprovided by the user, to actually be able to allocate the memory it needs.\n\nThis creates a clear distinction between functions that \"do not\" and those that \"actually do\"\nallocate memory. Just look at the arguments of the function.\nIf a function, or operator, has an allocator object as one of its inputs/arguments, then, you know for\nsure that this function/operator will allocate some memory during its execution.\n\nAn example is the `allocPrint()` function from the Zig Standard Library. With this function, you can\nwrite a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C.\nIn order to write such a new string, the `allocPrint()` function needs to allocate some memory to store the\noutput string.\n\nThat is why the first argument of this function is an allocator object that you, the user/programmer, give\nas input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator\nobject. 
But I could easily use any other type of allocator object from the Zig Standard Library.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n allocator,\n \"Hello {s}!!!\",\n .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello Pedro!!!\n```\n\n\n:::\n:::\n\n\n\n\n\nYou get a lot of control over where and how much memory this function can allocate.\nBecause it is you, the user/programmer, that provides the allocator for the function to use.\nThis makes \"total control\" over memory management easier to achieve in Zig.\n\n### What are allocators?\n\nAllocators in Zig are objects that you can use to allocate memory for your program.\nThey are similar to the memory allocating functions in C, like `malloc()` and `calloc()`.\nSo, if you need to use more memory than you initially have, during the execution of your program, you can simply ask\nfor more memory by using an allocator object.\n\nZig offers different types of allocators, and they are usually available through the `std.heap` module of\nthe standard library. Thus, just import the Zig Standard Library into your Zig module (with `@import(\"std\")`), and you can start\nusing these allocators in your code.\n\nFurthermore, every allocator object is built on top of the `Allocator` interface in Zig.\nThis means that, every allocator object you find in Zig must have the methods `alloc()`,\n`create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using,\nbut you don't need to change the function calls to the methods that do the memory allocation\n(and the free memory operations) for your program.\n\n### Why you need an allocator?\n\nAs we described at @sec-stack, everytime you make a function call in Zig,\na space in the stack is reserved for this function call. 
But the stack\nhas a key limitation which is: every object stored in the stack has a\nknown fixed length.\n\nBut in reality, there are two very common instances where this \"fixed length limitation\" of the stack is a deal breaker:\n\n1. the objects that you create inside your function might grow in size during the execution of the function.\n1. sometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer\nto a local object. As I described at @sec-stack, you cannot do that if this local object is stored in the\nstack. However, if this object is stored in the heap, then, you can return a pointer to this object at the\nend of the function, because you (the programmer) control the lifetime of any heap memory that you allocate. You decide\nwhen this memory gets destroyed/freed.\n\nThese are common situations for which the stack is not a good fit.\nThat is why you need a different memory management strategy to\nstore these objects inside your function. You need to use\na memory type that can grow together with your objects, or whose\nlifetime you can control.\nThe heap fits this description.\n\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size\nduring the execution of your program, you grow the amount of memory\nyou have by allocating more memory in the heap to store these objects. 
 \nAnd you do that in Zig, by using an allocator object.\n\n\n### The different types of allocators\n\n\nAt the time of writing this book, in Zig, we have 6 different\nallocators available in the standard library:\n\n- `GeneralPurposeAllocator()`.\n- `page_allocator()`.\n- `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()`.\n- `ArenaAllocator()`.\n- `c_allocator()` (requires you to link to libc).\n\n\nEach allocator has its own perks and limitations. All allocators, except `FixedBufferAllocator()` and `ArenaAllocator()`,\nare allocators that use the heap memory. So any memory that you allocate with\nthese allocators, will be placed in the heap.\n\n### General-purpose allocators\n\nThe `GeneralPurposeAllocator()`, as the name suggests, is a \"general purpose\" allocator. You can use it for every type\nof task. In the example below, I'm allocating enough space to store a single integer in the object `some_number`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const some_number = try allocator.create(u32);\n defer allocator.destroy(some_number);\n\n some_number.* = @as(u32, 45);\n}\n```\n:::\n\n\n\n\n\nWhile the `GeneralPurposeAllocator()` is useful, you might instead want to use the `c_allocator()`, which is an alias to the C standard allocator `malloc()`. So, yes, you can use\n`malloc()` in Zig if you want to. Just use the `c_allocator()` from the Zig standard library. However,\nif you do use `c_allocator()`, you must link to Libc when compiling your source code with the\n`zig` compiler, by including the flag `-lc` in your compilation process.\nIf you do not link your source code to Libc, Zig will not be able to find the\n`malloc()` implementation in your system.\n\n### Page allocator\n\nThe `page_allocator()` is an allocator that allocates full pages of memory in the heap. 
In other words,\nevery time you allocate memory with `page_allocator()`, a full page of memory in the heap is allocated,\ninstead of just a small piece of it.\n\nThe size of this page depends on the system you are using.\nMost systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally\nallocated in each call by `page_allocator()`. That is why, `page_allocator()` is considered a\nfast, but also \"wasteful\" allocator in Zig. Because it allocates a big amount of memory\nin each call, and you most likely will not need that much memory in your program.\n\n### Buffer allocators\n\nThe `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that\nwork with a fixed sized buffer object at the back. In other words, they use a fixed sized buffer\nobject as the basis for the memory. When you ask these allocator objects to allocate some memory for you,\nthey are essentially reserving some amount of space inside this fixed sized buffer object for you to use.\n\nThis means that, in order to use these allocators, you must first create a buffer object in your code,\nand then, give this buffer object as an input to these allocators.\n\nThis also means that, these allocator objects can allocate memory both in the stack or in the heap.\nEverything depends on where the buffer object that you provide lives. If this buffer object lives\nin the stack, then, the memory allocated is \"stack-based\". But if it lives on the heap, then,\nthe memory allocated is \"heap-based\".\n\n\nIn the example below, I'm creating a `buffer` object on the stack that is 10 elements long.\nNotice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.\nNow, because this `buffer` object is 10 elements long, this means that I am limited to this space.\nI cannot allocate more than 10 elements with this allocator object. 
If I try to\nallocate more than that, the `alloc()` method will return an `OutOfMemory` error value.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n```\n:::\n\n\n\n\nRemember, the memory allocated by these allocator objects can be either from\nthe stack, or, from the heap. It all depends on where the buffer object that you provide lives.\nIn the above example, the `buffer` object lives in the stack, and, therefore, the memory allocated\nis based in the stack. But what if it was based on the heap?\n\nAs we described at @sec-stack-overflow, one of the main reasons why you would use the heap,\ninstead of the stack, is to allocate huge amounts of space to store very big objects.\nThus, let's suppose you wanted to use a very big buffer object as the basis for your\nallocator objects. You would have to allocate this very big buffer object on the heap.\nThe example below demonstrates this case.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst heap = std.heap.page_allocator;\nconst memory_buffer = try heap.alloc(\n u8, 100 * 1024 * 1024 // 100 MB memory\n);\ndefer heap.free(memory_buffer);\nvar fba = std.heap.FixedBufferAllocator.init(\n memory_buffer\n);\nconst allocator = fba.allocator();\n\nconst input = try allocator.alloc(u8, 1000);\ndefer allocator.free(input);\n```\n:::\n\n\n\n\n\n\n### Arena allocator {#sec-arena-allocator}\n\nThe `ArenaAllocator()` is an allocator object that takes a child allocator as input. The idea behind the `ArenaAllocator()` in Zig\nis similar to the concept of \"arenas\" in the programming language Go[^go-arena]. 
It is an allocator object that allows you\nto allocate memory as many times as you want, but free all memory only once.\nIn other words, if you have, for example, called the `alloc()` method of an `ArenaAllocator()` object 5 times, you can\nfree all the memory you allocated over these 5 calls at once, by simply calling the `deinit()` method of the same `ArenaAllocator()` object.\n\n[^go-arena]: \n\nIf you give, for example, a `GeneralPurposeAllocator()` object as input to the `ArenaAllocator()` constructor, like in the example below, then, the allocations\nyou perform with `alloc()` will actually be made with the underlying object `GeneralPurposeAllocator()` that was passed.\nSo, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator\nreally does is help you free all the memory you allocated multiple times with just a single command. In the example\nbelow, I called `alloc()` 3 times. So, if I had not used an arena allocator, then, I would need to call\n`free()` 3 times to free all the allocated memory.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = try allocator.alloc(u8, 5);\nconst in2 = try allocator.alloc(u8, 10);\nconst in3 = try allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n```\n:::\n\n\n\n\n\n\n### The `alloc()` and `free()` methods\n\nIn the code example below, we are accessing the `stdin`, which is\nthe standard input channel, to receive an input from the\nuser. We read the input given by the user with the `readUntilDelimiterOrEof()`\nmethod.\n\nNow, after reading the input of the user, we need to store this input somewhere in\nour program. That is why I use an allocator in this example. I use it to allocate some\namount of memory to store this input given by the user. 
More specifically, the method `alloc()`\nof the allocator object is used to allocate an array capable of storing 50 `u8` values.\n\nNotice that this `alloc()` method receives two inputs. The first one is a type.\nThis defines what type of values the allocated array will store. In the example\nbelow, we are allocating an array of unsigned 8-bit integers (`u8`). But\nyou can create an array to store any type of value you want. Next, in the second argument, we\ndefine the size of the allocated array, by specifying how many elements\nthis array will contain. In the case below, we are allocating an array of 50 elements.\n\nAt @sec-zig-strings we described that strings in Zig are simply arrays of characters.\nEach character is represented by a `u8` value. So, this means that the array that\nwas allocated in the object `input` is capable of storing a string that is\n50 characters long.\n\nSo, in essence, the expression `var input: [50]u8 = undefined` would create\nan array for 50 `u8` values in the stack of the current scope. 
But, you\ncan allocate the same array in the heap by using the expression `var input = try allocator.alloc(u8, 50)`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n```\n:::\n\n\n\n\nAlso, notice that in this example, we use the `defer` keyword (which I described at @sec-defer) to run a small\npiece of code at the end of the current scope, which is the expression `allocator.free(input)`.\nWhen you execute this expression, the allocator will free the memory that it allocated\nfor the `input` object.\n\nWe have talked about this at @sec-heap. You **should always** explicitly free any memory that you allocate\nusing an allocator! You do that by using the `free()` method of the same allocator object you\nused to allocate this memory. The `defer` keyword is used in this example only to help us execute\nthis free operation at the end of the current scope.\n\n\n### The `create()` and `destroy()` methods\n\nWith the `alloc()` and `free()` methods, you can allocate memory to store multiple elements\nat once. In other words, with these methods, we always allocate an array to store multiple elements at once.\nBut what if you need enough space to store just a single item? Should you\nallocate an array of a single element through `alloc()`?\n\nThe answer is no! 
In this case,\nyou should use the `create()` method of the allocator object.\nEvery allocator object offers the `create()` and `destroy()` methods,\nwhich are used to allocate and free memory for a single item, respectively.\n\nSo, in essence, if you want to allocate memory to store an array of elements, you\nshould use `alloc()` and `free()`. But if you need to store just a single item,\nthen, the `create()` and `destroy()` methods are ideal for you.\n\nIn the example below, I'm defining a struct to represent a user of some sort.\nIt could be a user for a game, or a software to manage resources, it doesn't matter.\nNotice that I use the `create()` method this time, to store a single `User` object\nin the program. Also notice that I use the `destroy()` method to free the memory\nused by this object at the end of the scope.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst User = struct {\n id: usize,\n name: []const u8,\n\n pub fn init(id: usize, name: []const u8) User {\n return .{ .id = id, .name = name };\n }\n};\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const user = try allocator.create(User);\n defer allocator.destroy(user);\n\n user.* = User.init(0, \"Pedro\");\n}\n```\n:::\n", + "supporting": [ + "01-memory_files" + ], "filters": [ "rmarkdown/pagebreak.lua" ], diff --git a/_freeze/Chapters/01-zig-weird/execute-results/html.json b/_freeze/Chapters/01-zig-weird/execute-results/html.json index b307369..b08ef3b 100644 --- a/_freeze/Chapters/01-zig-weird/execute-results/html.json +++ b/_freeze/Chapters/01-zig-weird/execute-results/html.json @@ -1,9 +1,11 @@ { - "hash": "0481fd2f2b3006a17720b9c366a1cb97", + "hash": "e4259ed694d17eb32e554b9fa21edaa6", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n\n# Introducing Zig\n\nIn this chapter, I want to introduce you to 
the world of Zig.\nZig is a very young language that is being actively developed.\nAs a consequence, its world is still very wild and waiting to be explored.\nThis book is my attempt to help you on your personal journey of\nunderstanding and exploring the exciting world of Zig.\n\nIn this book, I assume you have previous experience with some programming\nlanguage, not necessarily with a low-level one.\nSo, if you have experience with Python, or JavaScript, for example, it will be fine.\nBut, if you do have experience with low-level languages, such as C, C++, or\nRust, you will probably learn faster throughout this book.\n\n## What is Zig?\n\nZig is a modern, low-level, and general-purpose programming language. Some programmers think of\nZig as a modern and better version of C.\n\nIn the author's personal interpretation, Zig is tightly connected with \"less is more\".\nInstead of trying to become a modern language by adding more and more features,\nmany of the core improvements that Zig brings to the\ntable are actually about removing annoying behaviours/features from C and C++.\nIn other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.\nAs a result, analyzing, writing and debugging applications become much easier and simpler in Zig than in C or C++.\n\nThis philosophy becomes clear with the following phrase from the official website of Zig:\n\n> \"Focus on debugging your application rather than debugging your programming language knowledge\".\n\nThis phrase is especially true for C++ programmers. Because C++ is a gigantic language,\nwith tons of features, and also, there are lots of different \"flavors of C++\". These elements\nare what make C++ so complex and hard to learn. Zig tries to go in the opposite direction.\nZig is a very simple language, more closely related to other simple languages such as C and Go.\n\nThe phrase above is still important for C programmers too. 
Because, even though C is a simple\nlanguage, it is still hard sometimes to read and understand C code. For example, pre-processor macros in\nC are a frequent source of confusion. They can make it hard to debug\nC programs, because macros are essentially a second language embedded in C that obscures\nyour C code. With macros, you are no longer 100% sure about which pieces\nof the code are being sent to the compiler, i.e.\nthey obscure the actual source code that you wrote.\n\nYou don't have macros in Zig. In Zig, the code you write is the actual code that gets compiled by the compiler.\nYou also don't have a hidden control flow happening behind the scenes. And, you also\ndon't have functions or operators from the standard library that make\nhidden memory allocations behind your back.\n\nBy being a simpler language, Zig becomes much clearer and easier to read/write,\nbut at the same time, it also achieves a much more robust state, with more consistent\nbehaviour in edge situations. Once again, less is more.\n\n\n## Hello world in Zig\n\nWe begin our journey in Zig by creating a small \"Hello World\" program.\nTo start a new Zig project on your computer, you simply call the `init` command\nfrom the `zig` compiler.\nJust create a new directory on your computer, then, init a new Zig project\ninside this directory, like this:\n\n```bash\nmkdir hello_world\ncd hello_world\nzig init\n```\n\n```\ninfo: created build.zig\ninfo: created build.zig.zon\ninfo: created src/main.zig\ninfo: created src/root.zig\ninfo: see `zig build --help` for a menu of options\n```\n\n### Understanding the project files {#sec-project-files}\n\nAfter you run the `init` command from the `zig` compiler, some new files\nare created inside of your current directory. First, a \"source\" (`src`) directory\nis created, containing two files, `main.zig` and `root.zig`. 
Each `.zig` file\nis a separate Zig module, which is simply a text file that contains some Zig code.\n\nBy convention, the `main.zig` module is where your main function lives. Thus,\nif you are building an executable program in Zig, you need to declare a `main()` function,\nwhich represents the entrypoint of your program, i.e. it is where the execution of your program begins.\n\nHowever, if you are building a library (instead of an executable program), then,\nthe normal procedure is to delete this `main.zig` file and start with the `root.zig` module.\nBy convention, the `root.zig` module is the root source file of your library.\n\n```bash\ntree .\n```\n\n```\n.\n├── build.zig\n├── build.zig.zon\n└── src\n ├── main.zig\n └── root.zig\n\n1 directory, 4 files\n```\n\nThe `init` command also creates two additional files in our working directory:\n`build.zig` and `build.zig.zon`. The first file (`build.zig`) represents a build script written in Zig.\nThis script is executed when you call the `build` command from the `zig` compiler.\nIn other words, this file contains Zig code that executes the necessary steps to build the entire project.\n\n\nLow-level languages normally use a compiler to build your\nsource code into binary executables or binary libraries.\nNevertheless, this process of compiling your source code and building\nbinary executables or binary libraries from it became a real challenge\nin the programming world, once the projects became bigger and bigger.\nAs a result, programmers created \"build systems\", which are a second set of tools designed to make this process\nof compiling and building complex projects easier.\n\nExamples of build systems are CMake, GNU Make, GNU Autoconf and Ninja,\nwhich are used to build complex C and C++ projects.\nWith these systems, you can write scripts, which are called \"build scripts\".\nThey are simply scripts that describe the necessary steps to compile/build\nyour project.\n\nHowever, these are separate tools, that do 
not\nbelong to C/C++ compilers, like `gcc` or `clang`.\nAs a result, in C/C++ projects, you have not only to install and\nmanage your C/C++ compilers, but you also have to install and manage\nthese build systems separately.\n\nIn Zig, we don't need to use a separate set of tools to build our projects,\nbecause a build system is embedded inside the language itself.\nTherefore, Zig contains a native build system in it, and\nwe can use this build system to write small scripts in Zig,\nwhich describe the necessary steps to build/compile our Zig project[^zig-build-system].\nSo, everything you need to build a complex Zig project is the\n`zig` compiler, and nothing more.\n\n[^zig-build-system]: .\n\n\nThe second generated file (`build.zig.zon`) is the Zig package manager configuration file,\nwhere you can list and manage the dependencies of your project. Yes, Zig has\na package manager (like `pip` in Python, `cargo` in Rust, or `npm` in JavaScript).\nThe `.zon` extension stands for Zig Object Notation (ZON), the format this file is written in,\nand this `build.zig.zon` file is similar to the `package.json` file\nin JavaScript projects, or, the `Pipfile` file in Python projects,\nor the `Cargo.toml` file in Rust projects.\n\n\n### The file `root.zig` {#sec-root-file}\n\nLet's take a look into the `root.zig` file.\nYou might have noticed that every line of code with an expression ends with a semicolon (`;`).\nThis follows the syntax of a C-family programming language[^c-family].\n\n[^c-family]: \n\nAlso, notice the `@import()` call at the first line. We use this built-in function\nto import functionality from other Zig modules into our current module.\nThis `@import()` function works similarly to the `#include` pre-processor directive\nin C or C++, or, to the `import` statement in Python or JavaScript code.\nIn this example, we are importing the `std` module,\nwhich gives you access to the Zig Standard Library.\n\nIn this `root.zig` file, we can also see how assignments (i.e. creating new objects)\nare made in Zig. 
You can create a new object in Zig by using the following syntax\n`(const|var) name = value;`. In the example below, we are creating two constant\nobjects (`std` and `testing`). At @sec-assignments we talk more about objects in general.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst testing = std.testing;\n\nexport fn add(a: i32, b: i32) i32 {\n return a + b;\n}\n```\n:::\n\n\n\n\nFunctions in Zig are declared using the `fn` keyword.\nIn this `root.zig` module, we are declaring a function called `add()`, which has two arguments named `a` and `b`.\nThe function returns an integer of the type `i32` as result.\n\n\nZig is a strongly-typed language. There are some specific situations where you can (if you want to) omit\nthe type of an object in your code, if this type can be inferred by the `zig` compiler (we talk more\nabout that at @sec-type-inference). But there are other situations where you do need to be explicit.\nFor example, you do have to explicitly specify the type of each function argument, and also,\nthe return type of every function that you create in Zig.\n\nWe specify the type of an object or a function argument in Zig by\nusing a colon character (`:`) followed by the type after the name of this object/function argument.\nWith the expressions `a: i32` and `b: i32`, we know that both `a` and `b` arguments have type `i32`,\nwhich is a signed 32 bit integer. In this part,\nthe syntax in Zig is identical to the syntax in Rust, which also specifies types by\nusing the colon character.\n\nLastly, we have the return type of the function at the end of the line, before we open\nthe curly braces to start writing the function's body. In the example above, this type is also\na signed 32 bit integer (`i32`) value.\n\nNotice that we also have an `export` keyword before the function declaration. This keyword\nis similar to the `extern` keyword in C. It exposes the function\nto make it available in the library API. 
Therefore, if you are writing\na library for other people to use, you have to expose the functions\nyou write in the public API of this library by using this `export` keyword.\nIf we removed the `export` keyword from the `add()` function declaration,\nthen, this function would no longer be exposed in the library object built\nby the `zig` compiler.\n\n\n### The `main.zig` file {#sec-main-file}\n\nNow that we have learned a lot about Zig's syntax from the `root.zig` file,\nlet's take a look at the `main.zig` file.\nA lot of the elements we saw in `root.zig` are also present in `main.zig`.\nBut there are some other elements that we haven't seen yet, so let's dive in.\n\nFirst, look at the return type of the `main()` function in this file.\nWe can see a small change. The return\ntype of the function (`void`) is accompanied by an exclamation mark (`!`).\nThis exclamation mark tells us that this `main()` function\nmight return an error.\n\nIn this example, the `main()` function can either return `void` or return an error.\nThis is an interesting feature of Zig. If you write a function and something inside of\nthe body of this function might return an error, then you are forced to:\n\n- either add the exclamation mark to the return type of the function and make it clear that\nthis function might return an error\n- or explicitly handle this error inside the function\n\nIn most programming languages, we normally handle (or deal with) an error through\na *try catch* pattern. Zig does have both `try` and `catch` keywords. But they work\na little differently than what you're probably used to in other languages.\n\nIf we look at the `main()` function below, you can see that we do have a `try` keyword\non the 5th line. 
But we do not have a `catch` keyword in this code.\nIn Zig, we use the `try` keyword to execute an expression that might return an error,\nwhich, in this example, is the `stdout.print()` expression.\n\nIn essence, the `try` keyword executes the expression `stdout.print()`. If this expression\nreturns a valid value, then, the `try` keyword does nothing. It only passes the value forward.\nBut if the expression does return an error, then, the `try` keyword unwraps the error value\nand returns this error from the function. When the error propagates out of `main()`,\nthe program stops and the error (with a trace, in debug builds) is printed to `stderr`.\n\nThis might sound weird to you if you come from a high-level language. Because in\nhigh-level languages, such as Python, if an error occurs somewhere, this error is automatically\nreturned and the execution of your program will automatically stop even if you don't want\nto stop the execution. You are obligated to face the error.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n:::\n\n\n\n\nAnother thing that you might have noticed in this code example, is that\nthe `main()` function is marked with the `pub` keyword.\nIt marks the `main()` function as a *public function* from this module.\n\nEvery function in your Zig module is by default private to this Zig module and can only be called from within the module,\nunless you explicitly mark this function as a public function with the `pub` keyword.\nThis means that the `pub` keyword in Zig does essentially the opposite of what the `static` keyword\ndoes in C/C++.\n\nBy making a function \"public\" you allow other Zig modules to access and call it.\nA calling Zig module imports the module with the `@import()`\nbuilt-in. 
That makes all public functions from the imported module visible.\n\n\n### Compiling your source code {#sec-compile-code}\n\nYou can compile your Zig modules into a binary executable by running the `build-exe` command\nfrom the `zig` compiler. You simply list all the Zig modules that you want to build after\nthe `build-exe` command, separated by spaces. In the example below, we are compiling the module `main.zig`.\n\n```bash\nzig build-exe src/main.zig\n```\n\nSince we are building an executable, the `zig` compiler will look for a `main()` function\ndeclared in any of the files that you list after the `build-exe` command. If\nthe compiler does not find a `main()` function declared somewhere, a\ncompilation error will be raised, warning about this mistake.\n\nThe `zig` compiler also offers the `build-lib` and `build-obj` commands, which work\nthe exact same way as the `build-exe` command. The only difference is that they compile your\nZig modules into a portable C ABI library, or, into object files, respectively.\n\nIn the case of the `build-exe` command, a binary executable file is created by the `zig`\ncompiler in the root directory of your project.\nIf we take a look now at the contents of our current directory, with a simple `ls` command, we can\nsee the binary file called `main` that was created by the compiler.\n\n```bash\nls\n```\n\n```\nbuild.zig build.zig.zon main src\n```\n\nIf I execute this binary executable, I get the \"Hello World\" message in the terminal,\nas we expected.\n\n```bash\n./main\n```\n\n```\nHello, world!\n```\n\n\n### Compile and execute at the same time {#sec-compile-run-code}\n\nIn the previous section, I presented the `zig build-exe` command, which\ncompiles Zig modules into an executable file. 
However, this means that,\nin order to execute the executable file, we have to run two different commands.\nFirst, the `zig build-exe` command, and then, we call the executable file\ncreated by the compiler.\n\nBut what if we wanted to perform these two steps,\nall at once, in a single command? We can do that by using the `zig run`\ncommand.\n\n```bash\nzig run src/main.zig\n```\n\n```\nHello, world!\n```\n\n### Compiling the entire project {#sec-compile-project}\n\nJust as I described at @sec-project-files, as our project grows in size and\ncomplexity, we usually prefer to organize the compilation and build process\nof the project into a build script, using some sort of \"build system\".\n\nIn other words, as our project grows in size and complexity,\nthe `build-exe`, `build-lib` and `build-obj` commands become\nharder to use directly. Because then, we start to list\nmore and more modules at the same time. We also\nstart to add built-in compilation flags to customize the\nbuild process for our needs, etc. It becomes a lot of work\nto write the necessary commands by hand.\n\nIn C/C++ projects, programmers normally opt to use CMake, Ninja, `Makefile` or `configure` scripts\nto organize this process. However, in Zig, we have a native build system in the language itself.\nSo, we can write build scripts in Zig to compile and build Zig projects. Then, all we\nneed to do is to call the `zig build` command to build our project.\n\nSo, when you execute the `zig build` command, the `zig` compiler will search\nfor a Zig module named `build.zig` inside your current directory, which\nshould be your build script, containing the necessary code to compile and\nbuild your project. 
If the compiler does find this `build.zig` file in your directory,\nthen, the compiler will essentially execute a `zig run` command\nover this `build.zig` file, to compile and execute this build\nscript, which in turn, will compile and build your entire project.\n\n\n```bash\nzig build\n```\n\n\nAfter you execute this \"build project\" command, a `zig-out` directory\nis created in the root of your project directory, where you can find\nthe binary executables and libraries created from your Zig modules\naccording to the build commands that you specified at `build.zig`.\nWe will talk more about the build system in Zig later in this book.\n\nIn the example below, I'm executing the binary executable\nnamed `hello_world` that was generated by the compiler after the\n`zig build` command.\n\n```bash\n./zig-out/bin/hello_world\n```\n\n```\nHello, world!\n```\n\n\n\n## How to learn Zig?\n\nWhat are the best strategies to learn Zig? \nFirst of all, of course this book will help you a lot on your journey through Zig.\nBut you will also need some extra resources if you want to be really good at Zig.\n\nAs a first tip, you can join a community with Zig programmers to get some help\nwhen you need it:\n\n- Reddit forum: ;\n- Ziggit community: ;\n- Discord, Slack, Telegram, and others: ;\n\nNow, one of the best ways to learn Zig is to simply read Zig code. Try\nto read Zig code often, and things will become more clear.\nA C/C++ programmer would also probably give you this same tip.\nBecause this strategy really works!\n\nNow, where can you find Zig code to read?\nI personally think that the best way of reading Zig code is to read the source code of the\nZig Standard Library. The Zig Standard Library is available at the [`lib/std` folder](https://github.com/ziglang/zig/tree/master/lib/std)[^zig-lib-std] on\nthe official GitHub repository of Zig. 
Access this folder, and start exploring the Zig modules.\n\nAlso, a great alternative is to read code from other large Zig\ncodebases, such as:\n\n1. the [Javascript runtime Bun](https://github.com/oven-sh/bun)[^bunjs].\n1. the [game engine Mach](https://github.com/hexops/mach)[^mach].\n1. a [LLama 2 LLM model implementation in Zig](https://github.com/cgbur/llama2.zig/tree/main)[^ll2].\n1. the [financial transactions database `tigerbeetle`](https://github.com/tigerbeetle/tigerbeetle)[^tiger].\n1. the [command-line arguments parser `zig-clap`](https://github.com/Hejsil/zig-clap)[^clap].\n1. the [UI framework `capy`](https://github.com/capy-ui/capy)[^capy].\n1. the [Language Server Protocol implementation for Zig, `zls`](https://github.com/zigtools/zls)[^zls].\n1. the [event-loop library `libxev`](https://github.com/mitchellh/libxev)[^xev].\n\n[^xev]: \n[^zls]: \n[^capy]: \n[^clap]: \n[^tiger]: \n[^ll2]: \n[^mach]: \n[^bunjs]: .\n\nAll these assets are available on GitHub,\nand this is great, because we can use the GitHub search bar to our advantage,\nto find Zig code that fits our description.\nFor example, you can always include `lang:Zig` in the GitHub search bar when you\nare searching for a particular pattern. This will limit the search to only Zig modules.\n\n[^zig-lib-std]: \n\nAlso, a great alternative is to consult online resources and documentation.\nHere is a quick list of resources that I personally use from time to time to learn\nmore about the language each day:\n\n- Zig Language Reference: ;\n- Zig Standard Library Reference: ;\n- Zig Guide: ;\n- Karl Seguin Blog: ;\n- Zig News: ;\n- Read the code written by one of the Zig core team members: ;\n- Some livecoding sessions are transmitted in the Zig Showtime Youtube Channel: ;\n\n\nAnother great strategy to learn Zig, or honestly, to learn any language you want,\nis to practice it by solving exercises. 
For example, there is a famous repository\nin the Zig community called [Ziglings](https://ziglings.org)[^ziglings],\nwhich contains more than 100 small exercises that you can solve. It is a repository of\ntiny programs written in Zig that are currently broken, and your responsibility is to\nfix these programs, and make them work again.\n\n[^ziglings]: .\n\nA famous tech YouTuber known as *The Primeagen* also posted some videos (on YouTube)\nwhere he solves these exercises from Ziglings. The first video is named\n[\"Trying Zig Part 1\"](https://www.youtube.com/watch?v=OPuztQfM3Fg&t=2524s&ab_channel=TheVimeagen)[^prime1].\n\n[^prime1]: .\n\nAnother great alternative is to solve the [Advent of Code exercises](https://adventofcode.com/)[^advent-code].\nThere are people that already took the time to learn and solve the exercises, and they posted\ntheir solutions on GitHub as well, so, in case you need some resource to compare while solving\nthe exercises, you can look at these two repositories:\n\n- ;\n- ;\n\n[^advent-code]: \n\n\n\n\n\n\n## Creating new objects in Zig (i.e. identifiers) {#sec-assignments}\n\nLet's talk more about objects in Zig. Readers that have past experience\nwith other programming languages might know this concept through\na different name, such as: \"variable\" or \"identifier\". In this book, I choose\nto use the term \"object\" to refer to this concept.\n\nTo create a new object (or a new \"identifier\") in Zig, we use\nthe keywords `const` or `var`. These keywords specify if the object\nthat you are creating is mutable or not.\nIf you use `const`, then the object you are\ncreating is a constant (or immutable) object, which means that once you declare this object, you\ncan no longer change the value stored inside this object.\n\nOn the other side, if you use `var`, then, you are creating a variable (or mutable) object.\nYou can change the value of this object as many times as you want. 
Using the\nkeyword `var` in Zig is similar to using the keywords `let mut` in Rust.\n\n### Constant objects vs variable objects\n\nIn the code example below, we are creating a new constant object called `age`.\nThis object stores a number representing the age of someone. However, this code example\ndoes not compile successfully. Because on the next line of code, we are trying to change the value\nof the object `age` to 25.\n\nThe `zig` compiler detects that we are trying to change\nthe value of an object/identifier that is constant, and because of that,\nthe compiler will raise a compilation error, warning us about the mistake.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 24;\n// The line below is not valid!\nage = 25;\n```\n:::\n\n\n\n\n```\nt.zig:10:5: error: cannot assign to constant\n age = 25;\n ~~^~~\n```\n\nIn contrast, if you use `var`, then, the object created is a variable object.\nWith `var` you can declare this object in your source code, and then,\nchange the value of this object as many times as you want at later points\nin your source code.\n\nSo, using the same code example shown above, if I change the declaration of the\n`age` object to use the `var` keyword, then, the program gets compiled successfully.\nBecause now, the `zig` compiler detects that we are changing the value of an\nobject that allows this behaviour, because it is a \"variable object\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = 24;\nage = 25;\n```\n:::\n\n\n\n\n\n### Declaring without an initial value\n\nBy default, when you declare a new object in Zig, you must give it\nan initial value. In other words, this means\nthat we have to declare, and, at the same time, initialize every object we\ncreate in our source code.\n\nOn the other hand, you can, in fact, declare a new object in your source code,\nand not give it an explicit value. 
But we need to use a special keyword for that,\nwhich is the `undefined` keyword.\n\nIt is important to emphasize that you should avoid using `undefined` as much as possible.\nBecause when you use this keyword, you leave your object uninitialized, and, as a consequence,\nif for some reason, your code uses this object while it is uninitialized, then, you will definitely\nhave undefined behaviour and major bugs in your program.\n\nIn the example below, I'm declaring the `age` object again. But this time,\nI do not give it an initial value. The variable is only initialized at\nthe second line of code, where I store the number 25 in this object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = undefined;\nage = 25;\n```\n:::\n\n\n\n\nHaving these points in mind, just remember that you should avoid using `undefined` in your code as much as possible.\nAlways declare and initialize your objects. Because this gives you much more safety in your program.\nBut in case you really need to declare an object without initializing it... the\n`undefined` keyword is the way to do it in Zig.\n\n\n### There is no such thing as unused objects\n\nEvery object (being constant or variable) that you declare in Zig **must be used in some way**. You can give this object\nto a function call, as a function argument, or, you can use it in another expression\nto calculate the value of another object, or, you can call a method that belongs to this\nparticular object. \n\nIt doesn't matter in which way you use it. As long as you use it.\nIf you try to break this rule, i.e. if you try to declare an object, but not use it,\nthe `zig` compiler will not compile your Zig source code, and it will issue an error\nmessage warning that you have unused objects in your code.\n\nLet's demonstrate this with an example. In the source code below, we declare a constant object\ncalled `age`. 
If you try to compile a simple Zig program with this line of code below,\nthe compiler will return an error as demonstrated below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 15;\n```\n:::\n\n\n\n\n```\nt.zig:4:11: error: unused local constant\n const age = 15;\n ^~~\n```\n\nEvery time you declare a new object in Zig, you have two choices:\n\n1. you either use the value of this object;\n1. or you explicitly discard the value of the object;\n\nTo explicitly discard the value of any object (constant or variable), all you need to do is to assign\nthis object to a special character in Zig, which is the underscore (`_`).\nWhen you assign an object to an underscore, like in the example below, the `zig` compiler will automatically\ndiscard the value of this particular object.\n\nYou can see in the example below that, this time, the compiler did not\ncomplain about any \"unused constant\", and successfully compiled our source code.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It compiles!\nconst age = 15;\n_ = age;\n```\n:::\n\n\n\n\nNow, remember, every time you assign a particular object to the underscore, this object\nis essentially destroyed. It is discarded by the compiler. This means that you can no longer\nuse this object further in your code. It doesn't exist anymore.\n\nSo if you try to use the constant `age` in the example below, after we discarded it, you\nwill get a loud error message from the compiler (talking about a \"pointless discard\")\nwarning you about this mistake.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It does not compile.\nconst age = 15;\n_ = age;\n// Using a discarded value!\nstd.debug.print(\"{d}\\n\", .{age + 2});\n```\n:::\n\n\n\n\n```\nt.zig:7:5: error: pointless discard\n of local constant\n```\n\n\nThis same rule applies to variable objects. Every variable object must also be used in\nsome way. 
And if you assign a variable object to the underscore,\nthis object also gets discarded, and you can no longer use this object.\n\n\n\n### You must mutate every variable object\n\nEvery variable object that you create in your source code must be mutated at some point.\nIn other words, if you declare an object as a variable\nobject, with the keyword `var`, and you do not change the value of this object\nat some point in the future, the `zig` compiler will detect this,\nand it will raise an error warning you about this mistake.\n\nThe concept behind this is that every object you create in Zig should be preferably a\nconstant object, unless you really need an object whose value will\nchange during the execution of your program.\n\nSo, if I try to declare a variable object such as `where_i_live` below,\nand I do not change the value of this object in some way,\nthe `zig` compiler raises an error message with the phrase \"variable is never mutated\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar where_i_live = \"Belo Horizonte\";\n_ = where_i_live;\n```\n:::\n\n\n\n\n```\nt.zig:7:5: error: local variable is never mutated\nt.zig:7:5: note: consider using 'const'\n```\n\n## Primitive Data Types {#sec-primitive-data-types}\n\nZig has many different primitive data types available for you to use.\nYou can see the full list of available data types at the official\n[Language Reference page](https://ziglang.org/documentation/master/#Primitive-Types)[^lang-data-types].\n\n[^lang-data-types]: .\n\nBut here is a quick list:\n\n- Unsigned integers: `u8`, 8-bit integer; `u16`, 16-bit integer; `u32`, 32-bit integer; `u64`, 64-bit integer; `u128`, 128-bit integer.\n- Signed integers: `i8`, 8-bit integer; `i16`, 16-bit integer; `i32`, 32-bit integer; `i64`, 64-bit integer; `i128`, 128-bit integer.\n- Float number: `f16`, 16-bit floating point; `f32`, 32-bit floating point; `f64`, 64-bit floating point; `f128`, 128-bit floating point.\n- Boolean: `bool`, represents true or false 
values.\n- C ABI compatible types: `c_long`, `c_char`, `c_short`, `c_ushort`, `c_int`, `c_uint`, and many others.\n- Pointer sized integers: `isize` and `usize`.\n\n\n\n\n\n\n\n## Arrays {#sec-arrays}\n\nYou create arrays in Zig by using a syntax that resembles the C syntax.\nFirst, you specify the size of the array (i.e. the number of elements that will be stored in the array)\nyou want to create inside a pair of brackets.\n\nThen, you specify the data type of the elements that will be stored inside this array.\nAll elements present in an array in Zig must have the same data type. For example, you cannot mix elements\nof type `f32` with elements of type `i32` in the same array.\n\nAfter that, you simply list the values that you want to store in this array inside\na pair of curly braces.\nIn the example below, I am creating two constant objects that contain different arrays.\nThe first object contains an array of 4 integer values, while the second object contains\nan array of 3 floating point values.\n\nNow, you should notice that in the object `ls`, I am\nnot explicitly specifying the size of the array inside of the brackets. Instead\nof using a literal value (like the value 4 that I used in the `ns` object), I am\nusing the special character underscore (`_`). 
This syntax tells the `zig` compiler\nto fill this field with the number of elements listed inside of the curly braces.\nSo, this syntax `[_]` is for lazy (or smart) programmers who leave the job of\ncounting how many elements there are in the curly braces for the compiler.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst ls = [_]f64{432.1, 87.2, 900.05};\n_ = ns; _ = ls;\n```\n:::\n\n\n\n\nIt is worth noting that these are static arrays, meaning that\nthey cannot grow in size.\nOnce you declare your array, you cannot change the size of it.\nThis is very common in low-level languages.\nBecause low-level languages normally want to give you (the programmer) full control over memory,\nand the way in which arrays are expanded is tightly related to\nmemory management.\n\n\n### Selecting elements of the array {#sec-select-array-elem}\n\nOne very common activity is to select specific portions of an array\nyou have in your source code.\nIn Zig, you can select a specific element from your\narray, by simply providing the index of this particular\nelement inside brackets after the object name.\nIn the example below, I am selecting the third element from the\n`ns` array. Notice that Zig is a \"zero-index\" based language,\nlike C, C++, Rust, Python, and many other languages.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\ntry stdout.print(\"{d}\\n\", .{ ns[2] });\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n12\n```\n\n\n:::\n:::\n\n\n\n\nIn contrast, you can also select specific slices (or sections) of your array, by using a\nrange selector. 
Some programmers also call these selectors \"slice selectors\",\nand they also exist in Rust, and have the exact same syntax as in Zig.\nAnyway, a range selector is a special expression in Zig that defines\na range of indexes, and it has the syntax `start..end`.\n\nIn the example below, at the second line of code,\nthe `sl` object stores a slice (or a portion) of the\n`ns` array. More precisely, the elements at index 1 and 2\nin the `ns` array. \n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\n_ = sl;\n```\n:::\n\n\n\n\nWhen you use the `start..end` syntax,\nthe \"end tail\" of the range selector is non-inclusive,\nmeaning that the index at the end is not included in the range that is\nselected from the array.\nTherefore, the syntax `start..end` actually means `start..end - 1` in practice.\n\nYou can, for example, create a slice that goes from the first to the\nlast elements of the array, by using the `ar[0..ar.len]` syntax.\nIn other words, it is a slice that\naccesses all elements in the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ar = [4]u8{48, 24, 12, 6};\nconst sl = ar[0..ar.len];\n_ = sl;\n```\n:::\n\n\n\n\nYou can also use the syntax `start..` in your range selector,\nwhich tells the `zig` compiler to select the portion of the array\nthat begins at the `start` index and goes until the last element of the array.\nIn the example below, we are selecting the range from index 1\nuntil the end of the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..];\n_ = sl;\n```\n:::\n\n\n\n\n\n### More on slices\n\nAs we discussed before, in Zig, you can select specific portions of an existing\narray. 
This is called *slicing* in Zig [@zigguide], because when you select a portion\nof an array, you are creating a slice object from that array.\n\nA slice object is essentially a pointer object accompanied by a length number.\nThe pointer object points to the first element in the slice, and the\nlength number tells the `zig` compiler how many elements there are in this slice.\n\n> Slices can be thought of as a pair of `[*]T` (the pointer to the data) and a `usize` (the element count) [@zigguide].\n\nThrough the pointer contained inside the slice you can access the elements (or values)\nthat are inside this range (or portion) that you selected from the original array.\nBut the length number (which you can access through the `len` property of your slice object)\nis the really big improvement (over C arrays for example) that Zig brings to the table here.\n\nBecause with this length number\nthe `zig` compiler can easily check if you are trying to access an index that is out of the bounds of this particular slice,\nor, if you are causing any buffer overflow problems. In the example below,\nwe access the `len` property of the slice `sl`, which tells us that this slice\nhas 2 elements in it.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\ntry stdout.print(\"{d}\\n\", .{sl.len});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n2\n```\n\n\n:::\n:::\n\n\n\n\n\n### Array operators\n\nThere are two array operators available in Zig that are very useful.\nThe array concatenation operator (`++`), and the array multiplication operator (`**`). 
As the name suggests,\nthese are array operators.\n\nOne important detail about these two operators is that they work\nonly when both operands have a size (or \"length\") that is compile-time known.\nWe are going to talk more about\nthe differences between \"compile-time known\" and \"runtime known\" at @sec-compile-time.\nBut for now, keep this information in mind, that you cannot use these operators in every situation.\n\nIn summary, the `++` operator creates a new array that is the concatenation\nof both arrays provided as operands. So, the expression `a ++ b` produces\na new array which contains all the elements from arrays `a` and `b`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst b = [_]u8{4,5};\nconst c = a ++ b;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 4, 5 }\n```\n\n\n:::\n:::\n\n\n\n\nThis `++` operator is particularly useful to concatenate strings together.\nStrings in Zig are described in depth at @sec-zig-strings. In summary, a string object in Zig\nis essentially an array of bytes. So, you can use this array concatenation operator\nto effectively concatenate strings together.\n\nIn contrast, the `**` operator is used to replicate an array multiple\ntimes. 
In other words, the expression `a ** 3` creates a new array\nwhich contains the elements of the array `a` repeated 3 times.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst c = a ** 2;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 1, 2, 3 }\n```\n\n\n:::\n:::\n\n\n\n\n\n### Runtime versus compile-time known length in slices\n\nWe are going to talk a lot about the differences between compile-time known\nand runtime known across this book, especially at @sec-compile-time.\nBut the basic idea is that a thing is compile-time known, when we know\neverything (the value, the attributes and the characteristics) about this thing at compile-time.\nIn contrast, a runtime known thing is when the exact value of a thing is calculated only at runtime.\nTherefore, we don't know the value of this thing at compile-time, only at runtime.\n\nWe have learned at @sec-select-array-elem that slices are created by using a *range selector*,\nwhich represents a range of indexes. When this \"range of indexes\" (i.e. the start and the end of this range)\nis known at compile-time, the slice object that gets created is actually, under the hood, just\na single-item pointer to an array.\n\nYou don't need to precisely understand what that means now. We are going to talk a lot about pointers\nat @sec-pointer. For now, just understand that, when the range of indexes is known at compile-time,\nthe slice that gets created is just a pointer to an array, accompanied by a length value that\ntells the size of the slice.\n\nIf you have a slice object like this, i.e. a slice that has a compile-time known range,\nyou can use common pointer operations over this slice object. 
For example, you can \ndereference the pointer of this slice, by using the `.*` syntax, like you would\ndo on a normal pointer object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst arr1 = [10]u64 {\n 1, 2, 3, 4, 5,\n 6, 7, 8, 9, 10\n};\n// This slice has a compile-time known range.\n// Because we know both the start and end of the range.\nconst slice = arr1[1..4];\n```\n:::\n\n\n\n\n\nOn the other hand, if the range of indexes is not known at compile time, then, the slice object\nthat gets created is not a pointer anymore, and, thus, it does not support pointer operations.\nFor example, maybe the start index is known at compile time, but the end index is not. In such\ncase, the range of the slice becomes runtime known only.\n\nIn the example below, the `slice` object has a runtime known range, because the end index of the range\nis not known at compile time. In other words, the size of the array at `buffer` is not known\nat compile time. When we execute this program, the size of the array might be 10, or, it might be 12\ndepending on where we execute it. Therefore, we don't know at compile time if\nthe slice object has a range of size 10, or, a range of size 12.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst builtin = @import(\"builtin\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var n: usize = 0;\n if (builtin.target.os.tag == .windows) {\n n = 10;\n } else {\n n = 12;\n }\n const buffer = try allocator.alloc(u64, n);\n const slice = buffer[0..];\n _ = slice;\n}\n```\n:::\n\n\n\n\n\n## Blocks and scopes {#sec-blocks}\n\nBlocks are created in Zig by a pair of curly braces. A block is just a group of\nexpressions (or statements) contained inside of a pair of curly braces. 
All of these expressions that\nare contained inside of this pair of curly braces belongs to the same scope.\n\nIn other words, a block just delimits a scope in your code.\nThe objects that you define inside the same block belongs to the same\nscope, and, therefore, are accessible from within this scope.\nAt the same time, these objects are not accessible outside of this scope.\nSo, you could also say that blocks are used to limit the scope of the objects that you create in\nyour source code. In less technical terms, blocks are used to specify where in your source code\nyou can access whatever object you have in your source code.\n\nSo, a block is just a group of expressions contained inside a pair of curly braces.\nAnd every block have its own scope separated from the others.\nThe body of a function is a classic example of a block. If statements, for and while loops\n(and any other structure in the language that uses the pair of curly braces)\nare also examples of blocks.\n\nThis means that, every if statement, or for loop,\netc., that you create in your source code have its own separate scope.\nThat is why you can't access the objects that you defined inside\nof your for loop (or if statement) in an outer scope, i.e. a scope outside of the for loop.\nBecause you are trying to access an object that belongs to a scope that is different\nthan your current scope.\n\n\nYou can create blocks within blocks, with multiple levels of nesting.\nYou can also (if you want to) give a label to a particular block, with the colon character (`:`).\nJust write `label:` before you open the pair of curly braces that delimits your block. When you label a block\nin Zig, you can use the `break` keyword to return a value from this block, like as if it\nwas a function's body. 
You just write the `break` keyword, followed by the block label in the format `:label`,\nand the expression that defines the value that you want to return.\n\nLike in the example below, where we are returning the value from the `y` object\nfrom the block `add_one`, and saving the result inside the `x` object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar y: i32 = 123;\nconst x = add_one: {\n y += 1;\n break :add_one y;\n};\nif (x == 124 and y == 124) {\n try stdout.print(\"Hey!\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHey!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n\n\n## How strings work in Zig? {#sec-zig-strings}\n\nThe first project that we are going to build and discuss in this book is a base64 encoder/decoder (@sec-base64).\nBut in order for us to build such a thing, we need to get a better understanding on how strings work in Zig.\nSo let's discuss this specific aspect of Zig.\n\nIn Zig, a string literal value is just a pointer to a null-terminated array of bytes (i.e. the same thing as a C string).\nHowever, a string object in Zig is a little more than just a pointer. A string object\nin Zig is an object of type `[]const u8`, and, this object always contains two things: the\nsame null-terminated array of bytes that you would find in a string literal value, plus a length value.\nEach byte in this \"array of bytes\" is represented by an `u8` value, which is an unsigned 8 bit integer,\nso, it is equivalent to the C data type `unsigned char`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This is a string literal value:\n\"A literal value\";\n// This is a string object:\nconst object: []const u8 = \"A string object\";\n```\n:::\n\n\n\n\nZig always assumes that this sequence of bytes is UTF-8 encoded. 
This might not be true for every\nsequence of bytes you're working with, but it is not really Zig's job to fix the encoding of your strings\n(you can use [`iconv`](https://www.gnu.org/software/libiconv/)[^libiconv] for that).\nToday, most of the text in our modern world, especially on the web, should be UTF-8 encoded.\nSo if your string literal is not UTF-8 encoded, then, you will likely\nhave problems in Zig.\n\n[^libiconv]: <https://www.gnu.org/software/libiconv/>\n\nLet’s take for example the word \"Hello\". In UTF-8, this sequence of characters (H, e, l, l, o)\nis represented by the sequence of decimal numbers 72, 101, 108, 108, 111. In hexadecimal, this\nsequence is `0x48`, `0x65`, `0x6C`, `0x6C`, `0x6F`. So if I take this sequence of hexadecimal values,\nand ask Zig to print this sequence of bytes as a sequence of characters (i.e. a string), then,\nthe text \"Hello\" will be printed into the terminal:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\n\npub fn main() !void {\n const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};\n try stdout.print(\"{s}\\n\", .{bytes});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello\n```\n\n\n:::\n:::\n\n\n\n\n\nIf you want to see the actual bytes that represent a string in Zig, you can use\na `for` loop to iterate through each byte in the string, and ask Zig to print each byte as a hexadecimal\nvalue to the terminal. 
You do that by using a `print()` statement with the `X` formatting specifier,\nlike you would normally do with the [`printf()` function](https://cplusplus.com/reference/cstdio/printf/)[^printfs] in C.\n\n[^printfs]: <https://cplusplus.com/reference/cstdio/printf/>\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_object) |byte| {\n try stdout.print(\"{X} \", .{byte});\n }\n try stdout.print(\"\\n\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: 54 68 69 \n 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F\n 66 20 73 74 72 69 6E 67 20 6C 69 74 65 72 61 6C\n 20 69 6E 20 5A 69 67 \n```\n\n\n:::\n:::\n\n\n\n\n### Strings in C\n\nAt first glance, this looks very similar to how C treats strings as well. In more detail, string values\nin C are treated internally as an array of arbitrary bytes, and this array is also null-terminated.\n\nBut one key difference between a Zig string and a C string, is that Zig also stores the length of\nthe array inside the string object. This small detail makes your code safer, because it is much\neasier for the Zig compiler to check if you are trying to access an element that is \"out of bounds\", i.e. if\nyou are trying to access memory that does not belong to you.\n\nTo achieve this same kind of safety in C, you have to do a lot of work that kind of seems pointless.\nSo getting this kind of safety is not automatic and much harder to do in C. 
For example, if you want\nto track the length of your string throughout your program in C, then, you first need to loop through\nthe array of bytes that represents this string, and find the null element (`'\\0'`) position to discover\nwhere exactly the array ends, or, in other words, to find how many elements the array of bytes contains.\n\nTo do that, you would need something like this in C. In this example, the C string stored in\nthe object `array` is 25 bytes long:\n\n\n\n\n::: {.cell}\n\n```{.c .cell-code}\n#include <stdio.h>\nint main() {\n char* array = \"An example of string in C\";\n int index = 0;\n while (1) {\n if (array[index] == '\\0') {\n break;\n }\n index++;\n }\n printf(\"Number of elements in the array: %d\\n\", index);\n}\n```\n:::\n\n\n\n\n```\nNumber of elements in the array: 25\n```\n\nBut in Zig, you do not have to do this, because the object already contains a `len`\nfield which stores the length information of the array. As an example, the `string_object` object below is 43 bytes long:\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n try stdout.print(\"{d}\\n\", .{string_object.len});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n43\n```\n\n\n:::\n:::\n\n\n\n\n\n### A better look at the object type\n\nNow, we can better inspect the type of the objects that Zig creates. To check the type of any object in Zig, you can use the\n`@TypeOf()` function. If we look at the type of the `simple_array` object below, you will find that this object\nis an array of 4 elements. 
Each element is a signed integer of 32 bits which corresponds to the data type `i32` in Zig.\nThat is what an object of type `[4]i32` is.\n\nBut if we look closely at the type of the `string_object` object below, you will find that this object is a\nconstant pointer (hence the `*const` annotation) to an array of 43 elements (or 43 bytes). Each element is a\nsingle byte (more precisely, an unsigned 8 bit integer - `u8`), that is why we have the `[43:0]u8` portion of the type below.\nIn other words, the string stored inside the `string_object` object is 43 bytes long.\nThat is why you have the type `*const [43:0]u8` below.\n\nIn the case of `string_object`, it is a constant pointer (`*const`) because the object `string_object` is declared\nas constant in the source code (in the line `const string_object = ...`). So, if we changed that for some reason, if\nwe declare `string_object` as a variable object (i.e. `var string_object = ...`), then, `string_object` would be\njust a normal pointer to an array of unsigned 8-bit integers (i.e. `* [43:0]u8`).\n\nNow, if we create an pointer to the `simple_array` object, then, we get a constant pointer to an array of 4 elements (`*const [4]i32`),\nwhich is very similar to the type of the `string_object` object. This demonstrates that a string object (or a string literal)\nin Zig is already a pointer to an array.\n\nJust remember that a \"pointer to an array\" is different than an \"array\". 
So a string object in Zig is a pointer to an array\nof bytes, and not simply an array of bytes.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n const simple_array = [_]i32{1, 2, 3, 4};\n try stdout.print(\n \"Type of array object: {}\",\n .{@TypeOf(simple_array)}\n );\n try stdout.print(\n \"Type of string object: {}\",\n .{@TypeOf(string_object)}\n );\n try stdout.print(\n \"Type of a pointer that points to the array object: {}\",\n .{@TypeOf(&simple_array)}\n );\n}\n```\n:::\n\n\n\n\n```\nType of array object: [4]i32\nType of string object: *const [43:0]u8\nType of a pointer that points to\n the array object: *const [4]i32\n```\n\n\n### Byte vs unicode points\n\nIs important to point out that each byte in the array is not necessarily a single character.\nThis fact arises from the difference between a single byte and a single unicode point.\n\nThe encoding UTF-8 works by assigning a number (which is called a unicode point) to each character in\nthe string. For example, the character \"H\" is stored in UTF-8 as the decimal number 72. This means that\nthe number 72 is the unicode point for the character \"H\". Each possible character that can appear in a\nUTF-8 encoded string have its own unicode point.\n\nFor example, the Latin Capital Letter A With Stroke (Ⱥ) is represented by the number (or the unicode point)\n570. However, this decimal number (570) is higher than the maximum number stored inside a single byte, which\nis 255. In other words, the maximum decimal number that can be represented with a single byte is 255. 
That is why,\nthe unicode point 570 is actually stored inside the computer’s memory as the bytes `C8 BA`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"Ⱥ\";\n _ = try stdout.write(\n \"Bytes that represents the string object: \"\n );\n for (string_object) |char| {\n try stdout.print(\"{X} \", .{char});\n }\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: C8 BA \n```\n\n\n:::\n:::\n\n\n\n\n\nThis means that to store the character Ⱥ in a UTF-8 encoded string, we need to use two bytes together\nto represent the number 570. That is why the relationship between bytes and unicode points is not always\n1 to 1. Each unicode point is a single character in the string, but a single byte does not always correspond\nto a single unicode point.\n\nAll of this means that if you loop through the elements of a string in Zig, you will be looping through the\nbytes that represent that string, and not through the characters of that string. In the Ⱥ example above,\nthe for loop needed two iterations (instead of a single iteration) to print the two bytes that represent this Ⱥ letter.\n\nNow, all English letters (or ASCII letters if you prefer) can be represented by a single byte in UTF-8. As a\nconsequence, if your UTF-8 string contains only English letters (or ASCII letters), then, you are lucky. Because\nthe number of bytes will be equal to the number of characters in that string. 
In other words, in this specific\nsituation, the relationship between bytes and unicode points is 1 to 1.\n\nBut on the other side, if your string contains other types of letters… for example, you might be working with\ntext data that contains, chinese, japanese or latin letters, then, the number of bytes necessary to represent\nyour UTF-8 string will likely be much higher than the number of characters in that string.\n\nIf you need to iterate through the characters of a string, instead of its bytes, then, you can use the\n`std.unicode.Utf8View` struct to create an iterator that iterates through the unicode points of your string.\n\nIn the example below, we loop through the japanese characters “アメリカ”. Each of the four characters in\nthis string is represented by three bytes. But the for loop iterates four times, one iteration for each\ncharacter/unicode point in this string:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n var utf8 = (\n (try std.unicode.Utf8View.init(\"アメリカ\"))\n .iterator()\n );\n while (utf8.nextCodepointSlice()) |codepoint| {\n try stdout.print(\n \"got codepoint {}\\n\",\n .{std.fmt.fmtSliceHexUpper(codepoint)}\n );\n }\n}\n```\n:::\n\n\n\n\n```\ngot codepoint E382A2\ngot codepoint E383A1\ngot codepoint E383AA\ngot codepoint E382AB\n```\n\n\n### Some useful functions for strings {#sec-strings-useful-funs}\n\nIn this section, I just want to quickly describe some functions from the Zig Standard Library\nthat are very useful to use when working with strings. 
Most notably:\n\n- `std.mem.eql()`: to compare if two strings are equal.\n- `std.mem.splitScalar()`: to split a string into substrings (returned through an iterator) given a delimiter value.\n- `std.mem.splitSequence()`: to split a string into substrings (returned through an iterator) given a substring delimiter.\n- `std.mem.startsWith()`: to check if string starts with substring.\n- `std.mem.endsWith()`: to check if string ends with substring.\n- `std.mem.trim()`: to remove specific values from both start and end of the string.\n- `std.mem.concat()`: to concatenate strings together.\n- `std.mem.count()`: to count the occurrences of substring in the string.\n- `std.mem.replace()`: to replace the occurrences of substring in the string.\n\nNotice that all of these functions come from the `mem` module of\nthe Zig Standard Library. This module contains multiple functions and methods\nthat are useful to work with memory and sequences of bytes in general.\n\nThe `eql()` function is used to check if two arrays of data are equal or not.\nSince strings are just arbitrary arrays of bytes, we can use this function to compare two strings together.\nThis function returns a boolean value indicating if the two strings are equal\nor not. The first argument of this function is the data type of the elements of the arrays\nthat are being compared.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name: []const u8 = \"Pedro\";\ntry stdout.print(\n \"{any}\\n\", .{std.mem.eql(u8, name, \"Pedro\")}\n);\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntrue\n```\n\n\n:::\n:::\n\n\n\n\nThe `splitScalar()` and `splitSequence()` functions are useful to split\na string into multiple fragments, like the `split()` method from Python strings. The difference between these two\nfunctions is that `splitScalar()` uses a single character as the separator to\nsplit the string, while `splitSequence()` uses a sequence of characters (a.k.a. a substring)\nas the separator. 
There is a practical example of these functions later in the book.\n\nThe `startsWith()` and `endsWith()` functions are pretty straightforward. They\nreturn a boolean value indicating if the string (or, more precisely, if the array of data)\nbegins (`startsWith`) or ends (`endsWith`) with the sequence provided.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name: []const u8 = \"Pedro\";\ntry stdout.print(\n \"{any}\\n\", .{std.mem.startsWith(u8, name, \"Pe\")}\n);\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntrue\n```\n\n\n:::\n:::\n\n\n\n\nThe `concat()` function, as the name suggests, concatenates two or more strings together.\nBecause the process of concatenating the strings involves allocating enough space to\naccommodate all the strings together, this `concat()` function receives an allocator\nobject as input.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst str1 = \"Hello\";\nconst str2 = \" you!\";\nconst str3 = try std.mem.concat(\n allocator, u8, &[_][]const u8{ str1, str2 }\n);\ntry stdout.print(\"{s}\\n\", .{str3});\n```\n:::\n\n\n\n\n```\nHello you!\n```\n\nAs you can imagine, the `replace()` function is used to replace substrings in a string by another substring.\nThis function works very similarly to the `replace()` method from Python strings. Therefore, you\nprovide a substring to search, and every time that the `replace()` function finds\nthis substring within the input string, it replaces this substring with the \"replacement substring\"\nthat you provided as input.\n\nIn the example below, we are taking the input string \"Hello\", and replacing all occurrences\nof the substring \"el\" inside this input string with \"34\", and saving the results inside the\n`buffer` object. 
As a result, the `replace()` function returns a `usize` value that\nindicates how many replacements were performed.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst str1 = \"Hello\";\nvar buffer: [5]u8 = undefined;\nconst nrep = std.mem.replace(\n u8, str1, \"el\", \"34\", buffer[0..]\n);\ntry stdout.print(\"New string: {s}\\n\", .{buffer});\ntry stdout.print(\"N of replacements: {d}\\n\", .{nrep});\n```\n:::\n\n\n\n\n```\nNew string: H34lo\nN of replacements: 1\n```\n\n\n\n\n\n\n## Safety in Zig\n\nA general trend in modern low-level programming languages is safety. As our modern world\nbecomes more interconnected with technology and computers,\nthe data produced by all of this technology becomes one of the most important\n(and also, one of the most dangerous) assets that we have.\n\nThis is probably the main reason why modern low-level programming languages\nhave been giving great attention to safety, especially memory safety, because\nmemory corruption is still the main target for hackers to exploit.\nThe reality is that we don't have an easy solution for this problem.\nFor now, we only have techniques and strategies that mitigate these\nproblems.\n\nAs Richard Feldman explains in his [most recent GOTO conference talk](https://www.youtube.com/watch?v=jIZpKpLCOiU&ab_channel=GOTOConferences)[^gotop],\nwe haven't yet figured out a way to achieve **true safety in technology**.\nIn other words, we haven't found a way to build software that won't be exploited\nwith 100% certainty. We can greatly reduce the risks of our software being\nexploited, by ensuring memory safety for example. But this is not enough\nto reach \"true safety\" territory.\n\nBecause even if you write your program in a \"safe language\", hackers can still\nexploit failures in the operating system where your program is running (e.g. 
maybe the\nsystem where your code is running has a \"backdoor exploit\" that can still\naffect your code in unexpected ways), or also, they can exploit the features\nfrom the architecture of your computer. A recently found exploit\nthat involves memory invalidation through a feature of \"memory tags\"\npresent in ARM chips is an example of that [@exploit1].\n\n[^gotop]: <https://www.youtube.com/watch?v=jIZpKpLCOiU&ab_channel=GOTOConferences>\n\nThe question is: what have Zig and other languages been doing to mitigate this problem?\nIf we take Rust as an example, Rust is, for the most part[^rust-safe], a memory safe\nlanguage, because it enforces specific rules on the developer. In other words, the key feature\nof Rust, the *borrow checker*, forces you to follow a specific logic when you are writing\nyour Rust code, and the Rust compiler will always complain every time you try to go out of this\npattern.\n\n[^rust-safe]: Actually, a lot of existing Rust code is still memory unsafe, because it communicates with external libraries through FFI (*foreign function interface*), which requires the `unsafe` keyword, and the Rust compiler cannot verify the memory safety of code inside `unsafe` blocks.\n\n\nIn contrast, the Zig language is not a memory safe language by default.\nThere are some memory safety features that you get for free in Zig,\nespecially in arrays and pointer objects. But there are other tools\noffered by the language, that are not used by default.\nIn other words, the `zig` compiler does not oblige you to use such tools.\n\nThe tools listed below are related to memory safety. That is, they help you to achieve\nmemory safety in your Zig code:\n\n- `defer` allows you to keep free operations physically close to allocations. This helps you to avoid memory leaks, \"use after free\", and also \"double-free\" problems. 
Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.\n- `errdefer` helps you to guarantee that your program frees the allocated memory, even if a runtime error occurs.\n- pointers and objects are non-nullable by default. This helps you to avoid memory problems that might arise from de-referencing null pointers.\n- Zig offers some native types of allocators (called \"testing allocators\") that can detect memory leaks and double-frees. These types of allocators are widely used in unit tests, so they transform your unit tests into a weapon that you can use to detect memory problems in your code.\n- arrays and slices in Zig have their lengths embedded in the object itself, which makes the `zig` compiler very effective at detecting \"index out-of-range\" type of errors, and avoiding buffer overflows.\n\n\nBesides these features that Zig offers that are related to memory safety issues, the language\nalso has some rules that help you to achieve another type of safety, which is more related to\nprogram logic safety. These rules are:\n\n- pointers and objects are non-nullable by default, which eliminates an edge case that might break the logic of your program.\n- switch statements must exhaust all possible options.\n- the `zig` compiler forces you to handle every possible error in your program.\n\n\n## Other parts of Zig\n\nWe already learned a lot about Zig's syntax, and also, some pretty technical\ndetails about it. 
Just as a quick recap:\n\n- We talked about how functions are written in Zig at @sec-root-file and @sec-main-file.\n- How to create new objects/identifiers at @sec-root-file and especially at @sec-assignments.\n- How strings work in Zig at @sec-zig-strings.\n- How to use arrays and slices at @sec-arrays.\n- How to import functionality from other Zig modules at @sec-root-file.\n\n\nBut, for now, this amount of knowledge is enough for us to continue with this book.\nLater, over the next chapters we will still talk more about other parts of\nZig's syntax that are also equally important. Such as:\n\n\n- How Object-Oriented programming can be done in Zig through *struct declarations* at @sec-structs-and-oop.\n- Basic control flow syntax at @sec-zig-control-flow.\n- Enums at @sec-enum;\n- Pointers and Optionals at @sec-pointer;\n- Error handling with `try` and `catch` at @sec-error-handling;\n- Unit tests at @sec-unittests;\n- Vectors at @sec-vectors-simd;\n- Build System at @sec-build-system;\n\n\n\n\n", - "supporting": [], + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n\n# Introducing Zig\n\nIn this chapter, I want to introduce you to the world of Zig.\nZig is a very young language that is being actively developed.\nAs a consequence, its world is still very wild and to be explored.\nThis book is my attempt to help you on your personal journey for\nunderstanding and exploring the exciting world of Zig.\n\nI assume you have previous experience with some programming\nlanguage in this book, not necessarily with a low-level one.\nSo, if you have experience with Python, or Javascript, for example, it will be fine.\nBut, if you do have experience with low-level languages, such as C, C++, or\nRust, you will probably learn faster throughout this book.\n\n## What is Zig?\n\nZig is a modern, low-level, and general-purpose programming language. 
Some programmers think of\nZig as a modern and better version of C.\n\nIn the author's personal interpretation, Zig is tightly connected with \"less is more\".\nInstead of trying to become a modern language by adding more and more features,\nmany of the core improvements that Zig brings to the\ntable are actually about removing annoying behaviours/features from C and C++.\nIn other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.\nAs a result, analyzing, writing and debugging applications becomes much easier and simpler in Zig than it is in C or C++.\n\nThis philosophy becomes clear with the following phrase from the official website of Zig:\n\n> \"Focus on debugging your application rather than debugging your programming language knowledge\".\n\nThis phrase is especially true for C++ programmers. Because C++ is a gigantic language,\nwith tons of features, and also, there are lots of different \"flavors of C++\". These elements\nare what make C++ so complex and hard to learn. Zig tries to go in the opposite direction.\nZig is a very simple language, more closely related to other simple languages such as C and Go.\n\nThe phrase above is still important for C programmers too. Because, even though C is a simple\nlanguage, it is still hard sometimes to read and understand C code. For example, pre-processor macros in\nC are a frequent source of confusion. They sometimes make it really hard to debug\nC programs. Because macros are essentially a second language embedded in C that obscures\nyour C code. With macros, you are no longer 100% sure about which pieces\nof the code are being sent to the compiler, i.e.\nthey obscure the actual source code that you wrote.\n\nYou don't have macros in Zig. In Zig, the code you write is the actual code that gets compiled by the compiler.\nYou also don't have a hidden control flow happening behind the scenes. 
And, you also\ndon't have functions or operators from the standard library that make\nhidden memory allocations behind your back.\n\nBy being a simpler language, Zig becomes much more clear and easier to read/write,\nbut at the same time, it also achieves a much more robust state, with more consistent\nbehaviour in edge situations. Once again, less is more.\n\n\n## Hello world in Zig\n\nWe begin our journey in Zig by creating a small \"Hello World\" program.\nTo start a new Zig project in your computer, you simply call the `init` command\nfrom the `zig` compiler.\nJust create a new directory in your computer, then, init a new Zig project\ninside this directory, like this:\n\n```bash\nmkdir hello_world\ncd hello_world\nzig init\n```\n\n```\ninfo: created build.zig\ninfo: created build.zig.zon\ninfo: created src/main.zig\ninfo: created src/root.zig\ninfo: see `zig build --help` for a menu of options\n```\n\n### Understanding the project files {#sec-project-files}\n\nAfter you run the `init` command from the `zig` compiler, some new files\nare created inside of your current directory. First, a \"source\" (`src`) directory\nis created, containing two files, `main.zig` and `root.zig`. Each `.zig` file\nis a separate Zig module, which is simply a text file that contains some Zig code.\n\nBy convention, the `main.zig` module is where your main function lives. Thus,\nif you are building an executable program in Zig, you need to declare a `main()` function,\nwhich represents the entrypoint of your program, i.e. 
it is where the execution of your program begins.\n\nHowever, if you are building a library (instead of an executable program), then,\nthe normal procedure is to delete this `main.zig` file and start with the `root.zig` module.\nBy convention, the `root.zig` module is the root source file of your library.\n\n```bash\ntree .\n```\n\n```\n.\n├── build.zig\n├── build.zig.zon\n└── src\n ├── main.zig\n └── root.zig\n\n1 directory, 4 files\n```\n\nThe `init` command also creates two additional files in our working directory:\n`build.zig` and `build.zig.zon`. The first file (`build.zig`) represents a build script written in Zig.\nThis script is executed when you call the `build` command from the `zig` compiler.\nIn other words, this file contains Zig code that executes the necessary steps to build the entire project.\n\n\nLow-level languages normally use a compiler to build your\nsource code into binary executables or binary libraries.\nNevertheless, this process of compiling your source code and building\nbinary executables or binary libraries from it became a real challenge\nin the programming world, once the projects became bigger and bigger.\nAs a result, programmers created \"build systems\", which are a second set of tools designed to make this process\nof compiling and building complex projects easier.\n\nExamples of build systems are CMake, GNU Make, GNU Autoconf and Ninja,\nwhich are used to build complex C and C++ projects.\nWith these systems, you can write scripts, which are called \"build scripts\".\nThey are simply scripts that describe the necessary steps to compile/build\nyour project.\n\nHowever, these are separate tools that do not\nbelong to C/C++ compilers, like `gcc` or `clang`.\nAs a result, in C/C++ projects, you have not only to install and\nmanage your C/C++ compilers, but you also have to install and manage\nthese build systems separately.\n\nIn Zig, we don't need to use a separate set of tools to build our projects,\nbecause a build system is 
embedded inside the language itself.\nTherefore, Zig contains a native build system in it, and\nwe can use this build system to write small scripts in Zig,\nwhich describe the necessary steps to build/compile our Zig project[^zig-build-system].\nSo, everything you need to build a complex Zig project is the\n`zig` compiler, and nothing more.\n\n[^zig-build-system]: .\n\n\nThe second generated file (`build.zig.zon`) is the Zig package manager configuration file,\nwhere you can list and manage the dependencies of your project. Yes, Zig has\na package manager (like `pip` in Python, `cargo` in Rust, or `npm` in JavaScript),\nand this `build.zig.zon` file, which is written in ZON (Zig Object Notation), is similar to the `package.json` file\nin JavaScript projects, or, the `Pipfile` file in Python projects,\nor the `Cargo.toml` file in Rust projects.\n\n\n### The file `root.zig` {#sec-root-file}\n\nLet's take a look into the `root.zig` file.\nYou might have noticed that every line of code with an expression ends with a semicolon (`;`).\nThis follows the syntax of a C-family programming language[^c-family].\n\n[^c-family]: \n\nAlso, notice the `@import()` call at the first line. We use this built-in function\nto import functionality from other Zig modules into our current module.\nThis `@import()` function works similarly to the `#include` pre-processor directive\nin C or C++, or, to the `import` statement in Python or JavaScript code.\nIn this example, we are importing the `std` module,\nwhich gives you access to the Zig Standard Library.\n\nIn this `root.zig` file, we can also see how assignments (i.e. creating new objects)\nare made in Zig. You can create a new object in Zig by using the following syntax\n`(const|var) name = value;`. In the example below, we are creating two constant\nobjects (`std` and `testing`). 
At @sec-assignments we talk more about objects in general.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst testing = std.testing;\n\nexport fn add(a: i32, b: i32) i32 {\n return a + b;\n}\n```\n:::\n\n\n\n\nFunctions in Zig are declared using the `fn` keyword.\nIn this `root.zig` module, we are declaring a function called `add()`, which has two arguments named `a` and `b`.\nThe function returns an integer of the type `i32` as result.\n\n\nZig is a strongly-typed language. There are some specific situations where you can (if you want to) omit\nthe type of an object in your code, if this type can be inferred by the `zig` compiler (we talk more\nabout that at @sec-type-inference). But there are other situations where you do need to be explicit.\nFor example, you do have to explicitly specify the type of each function argument, and also,\nthe return type of every function that you create in Zig.\n\nWe specify the type of an object or a function argument in Zig by\nusing a colon character (`:`) followed by the type after the name of this object/function argument.\nWith the expressions `a: i32` and `b: i32`, we know that both `a` and `b` arguments have type `i32`,\nwhich is a signed 32 bit integer. In this part,\nthe syntax in Zig is identical to the syntax in Rust, which also specifies types by\nusing the colon character.\n\nLastly, we have the return type of the function at the end of the line, before we open\nthe curly braces to start writing the function's body. In the example above, this type is also\na signed 32 bit integer (`i32`) value.\n\nNotice that we also have an `export` keyword before the function declaration. This keyword\nis similar to the `extern` keyword in C. It exposes the function\nto make it available in the library API. 
Therefore, if you are writing\na library for other people to use, you have to expose the functions\nyou write in the public API of this library by using this `export` keyword.\nIf we removed the `export` keyword from the `add()` function declaration,\nthen, this function would no longer be exposed in the library object built\nby the `zig` compiler.\n\n\n### The `main.zig` file {#sec-main-file}\n\nNow that we have learned a lot about Zig's syntax from the `root.zig` file,\nlet's take a look at the `main.zig` file.\nA lot of the elements we saw in `root.zig` are also present in `main.zig`.\nBut there are some other elements that we haven't seen yet, so let's dive in.\n\nFirst, look at the return type of the `main()` function in this file.\nWe can see a small change. The return\ntype of the function (`void`) is accompanied by an exclamation mark (`!`).\nThis exclamation mark tells us that this `main()` function\nmight return an error.\n\nIn this example, the `main()` function can either return `void` or return an error.\nThis is an interesting feature of Zig. If you write a function and something inside of\nthe body of this function might return an error, then you are forced to:\n\n- either add the exclamation mark to the return type of the function and make it clear that\nthis function might return an error\n- or explicitly handle this error inside the function\n\nIn most programming languages, we normally handle (or deal with) an error through\na *try catch* pattern. Zig does have both `try` and `catch` keywords. But they work\na little differently than what you're probably used to in other languages.\n\nIf we look at the `main()` function below, you can see that we do have a `try` keyword\non the 5th line. 
But we do not have a `catch` keyword in this code.\nIn Zig, we use the `try` keyword to execute an expression that might return an error,\nwhich, in this example, is the `stdout.print()` expression.\n\nIn essence, the `try` keyword executes the expression `stdout.print()`. If this expression\nreturns a valid value, then, the `try` keyword does nothing. It only passes the value forward.\nBut if the expression does return an error, then, the `try` keyword unwraps the error value\nand returns this error from the function. If the error ends up being returned from `main()`,\nthe program prints it, along with an error return trace, to `stderr`.\n\nThis might sound weird to you if you come from a high-level language. Because in\nhigh-level languages, such as Python, if an error occurs somewhere, this error is automatically\nreturned and the execution of your program will automatically stop even if you don't want\nto stop the execution. You are obligated to face the error.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n:::\n\n\n\n\nAnother thing that you might have noticed in this code example, is that\nthe `main()` function is marked with the `pub` keyword.\nIt marks the `main()` function as a *public function* from this module.\n\nEvery function in your Zig module is by default private to this Zig module and can only be called from within the module,\nunless you explicitly mark this function as a public function with the `pub` keyword.\nThis means that the `pub` keyword in Zig does essentially the opposite of what the `static` keyword\ndoes in C/C++.\n\nBy making a function \"public\" you allow other Zig modules to access and call it.\nA calling Zig module imports the module with the `@import()`\nbuilt-in. 
That makes all public functions from the imported module visible.\n\n\n### Compiling your source code {#sec-compile-code}\n\nYou can compile your Zig modules into a binary executable by running the `build-exe` command\nfrom the `zig` compiler. You simply list all the Zig modules that you want to build after\nthe `build-exe` command, separated by spaces. In the example below, we are compiling the module `main.zig`.\n\n```bash\nzig build-exe src/main.zig\n```\n\nSince we are building an executable, the `zig` compiler will look for a `main()` function\ndeclared in any of the files that you list after the `build-exe` command. If\nthe compiler does not find a `main()` function declared somewhere, a\ncompilation error will be raised, warning about this mistake.\n\nThe `zig` compiler also offers the `build-lib` and `build-obj` commands, which work\nthe exact same way as the `build-exe` command. The only difference is that they compile your\nZig modules into a portable C ABI library, or, into object files, respectively.\n\nIn the case of the `build-exe` command, a binary executable file is created by the `zig`\ncompiler in the root directory of your project.\nIf we take a look now at the contents of our current directory, with a simple `ls` command, we can\nsee the binary file called `main` that was created by the compiler.\n\n```bash\nls\n```\n\n```\nbuild.zig build.zig.zon main src\n```\n\nIf I execute this binary executable, I get the \"Hello World\" message in the terminal,\nas we expected.\n\n```bash\n./main\n```\n\n```\nHello, world!\n```\n\n\n### Compile and execute at the same time {#sec-compile-run-code}\n\nIn the previous section, I presented the `zig build-exe` command, which\ncompiles Zig modules into an executable file. 
However, this means that,\nin order to execute the executable file, we have to run two different commands.\nFirst, the `zig build-exe` command, and then, we call the executable file\ncreated by the compiler.\n\nBut what if we wanted to perform these two steps,\nall at once, in a single command? We can do that by using the `zig run`\ncommand.\n\n```bash\nzig run src/main.zig\n```\n\n```\nHello, world!\n```\n\n### Compiling the entire project {#sec-compile-project}\n\nJust as I described at @sec-project-files, as our project grows in size and\ncomplexity, we usually prefer to organize the compilation and build process\nof the project into a build script, using some sort of \"build system\".\n\nIn other words, as our project grows in size and complexity,\nthe `build-exe`, `build-lib` and `build-obj` commands become\nharder to use directly. Because then, we start to list\nmultiple and multiple modules at the same time. We also\nstart to add built-in compilation flags to customize the\nbuild process for our needs, etc. It becomes a lot of work\nto write the necessary commands by hand.\n\nIn C/C++ projects, programmers normally opt to use CMake, Ninja, `Makefile` or `configure` scripts\nto organize this process. However, in Zig, we have a native build system in the language itself.\nSo, we can write build scripts in Zig to compile and build Zig projects. Then, all we\nneed to do, is to call the `zig build` command to build our project.\n\nSo, when you execute the `zig build` command, the `zig` compiler will search\nfor a Zig module named `build.zig` inside your current directory, which\nshould be your build script, containing the necessary code to compile and\nbuild your project. 
If the compiler does find this `build.zig` file in your directory,\nthen, the compiler will essentially execute a `zig run` command\nover this `build.zig` file, to compile and execute this build\nscript, which in turn, will compile and build your entire project.\n\n\n```bash\nzig build\n```\n\n\nAfter you execute this \"build project\" command, a `zig-out` directory\nis created in the root of your project directory, where you can find\nthe binary executables and libraries created from your Zig modules\naccording to the build commands that you specified in `build.zig`.\nWe will talk more about the build system in Zig later in this book.\n\nIn the example below, I'm executing the binary executable\nnamed `hello_world` that was generated by the compiler after the\n`zig build` command.\n\n```bash\n./zig-out/bin/hello_world\n```\n\n```\nHello, world!\n```\n\n\n\n## How to learn Zig?\n\nWhat are the best strategies to learn Zig? \nFirst of all, of course this book will help you a lot on your journey through Zig.\nBut you will also need some extra resources if you want to be really good at Zig.\n\nAs a first tip, you can join a community with Zig programmers to get some help\nwhen you need it:\n\n- Reddit forum: ;\n- Ziggit community: ;\n- Discord, Slack, Telegram, and others: ;\n\nNow, one of the best ways to learn Zig is to simply read Zig code. Try\nto read Zig code often, and things will become more clear.\nA C/C++ programmer would also probably give you this same tip.\nBecause this strategy really works!\n\nNow, where can you find Zig code to read?\nI personally think that the best way of reading Zig code is to read the source code of the\nZig Standard Library. The Zig Standard Library is available at the [`lib/std` folder](https://github.com/ziglang/zig/tree/master/lib/std)[^zig-lib-std] on\nthe official GitHub repository of Zig. 
Access this folder, and start exploring the Zig modules.\n\nAlso, a great alternative is to read code from other large Zig\ncodebases, such as:\n\n1. the [JavaScript runtime Bun](https://github.com/oven-sh/bun)[^bunjs].\n1. the [game engine Mach](https://github.com/hexops/mach)[^mach].\n1. a [LLama 2 LLM model implementation in Zig](https://github.com/cgbur/llama2.zig/tree/main)[^ll2].\n1. the [financial transactions database `tigerbeetle`](https://github.com/tigerbeetle/tigerbeetle)[^tiger].\n1. the [command-line arguments parser `zig-clap`](https://github.com/Hejsil/zig-clap)[^clap].\n1. the [UI framework `capy`](https://github.com/capy-ui/capy)[^capy].\n1. the [Language Server Protocol implementation for Zig, `zls`](https://github.com/zigtools/zls)[^zls].\n1. the [event-loop library `libxev`](https://github.com/mitchellh/libxev)[^xev].\n\n[^xev]: \n[^zls]: \n[^capy]: \n[^clap]: \n[^tiger]: \n[^ll2]: \n[^mach]: \n[^bunjs]: .\n\nAll these assets are available on GitHub,\nand this is great, because we can use the GitHub search bar to our advantage,\nto find Zig code that fits our description.\nFor example, you can always include `lang:Zig` in the GitHub search bar when you\nare searching for a particular pattern. This will limit the search to only Zig modules.\n\n[^zig-lib-std]: \n\nAlso, a great alternative is to consult online resources and documentation.\nHere is a quick list of resources that I personally use from time to time to learn\nmore about the language each day:\n\n- Zig Language Reference: ;\n- Zig Standard Library Reference: ;\n- Zig Guide: ;\n- Karl Seguin Blog: ;\n- Zig News: ;\n- Read the code written by one of the Zig core team members: ;\n- Some livecoding sessions are transmitted in the Zig Showtime YouTube Channel: ;\n\n\nAnother great strategy to learn Zig, or honestly, to learn any language you want,\nis to practice it by solving exercises. 
For example, there is a famous repository\nin the Zig community called [Ziglings](https://ziglings.org)[^ziglings],\nwhich contains more than 100 small exercises that you can solve. It is a repository of\ntiny programs written in Zig that are currently broken, and your responsibility is to\nfix these programs, and make them work again.\n\n[^ziglings]: .\n\nA famous tech YouTuber known as *The Primeagen* also posted some videos (on YouTube)\nwhere he solves these exercises from Ziglings. The first video is named\n[\"Trying Zig Part 1\"](https://www.youtube.com/watch?v=OPuztQfM3Fg&t=2524s&ab_channel=TheVimeagen)[^prime1].\n\n[^prime1]: .\n\nAnother great alternative is to solve the [Advent of Code exercises](https://adventofcode.com/)[^advent-code].\nThere are people that already took the time to learn and solve the exercises, and they posted\ntheir solutions on GitHub as well, so, in case you need some resource to compare while solving\nthe exercises, you can look at these two repositories:\n\n- ;\n- ;\n\n[^advent-code]: \n\n\n\n\n\n\n## Creating new objects in Zig (i.e. identifiers) {#sec-assignments}\n\nLet's talk more about objects in Zig. Readers that have past experience\nwith other programming languages might know this concept through\na different name, such as: \"variable\" or \"identifier\". In this book, I choose\nto use the term \"object\" to refer to this concept.\n\nTo create a new object (or a new \"identifier\") in Zig, we use\nthe keywords `const` or `var`. These keywords specify if the object\nthat you are creating is mutable or not.\nIf you use `const`, then the object you are\ncreating is a constant (or immutable) object, which means that once you declare this object, you\ncan no longer change the value stored inside this object.\n\nOn the other hand, if you use `var`, then, you are creating a variable (or mutable) object.\nYou can change the value of this object as many times as you want. 
Using the\nkeyword `var` in Zig is similar to using the keywords `let mut` in Rust.\n\n### Constant objects vs variable objects\n\nIn the code example below, we are creating a new constant object called `age`.\nThis object stores a number representing the age of someone. However, this code example\ndoes not compile successfully, because on the next line of code, we are trying to change the value\nof the object `age` to 25.\n\nThe `zig` compiler detects that we are trying to change\nthe value of an object/identifier that is constant, and because of that,\nthe compiler will raise a compilation error, warning us about the mistake.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 24;\n// The line below is not valid!\nage = 25;\n```\n:::\n\n\n\n\n```\nt.zig:10:5: error: cannot assign to constant\n age = 25;\n ~~^~~\n```\n\nIn contrast, if you use `var`, then, the object created is a variable object.\nWith `var` you can declare this object in your source code, and then,\nchange the value of this object as many times as you want at later points\nin your source code.\n\nSo, using the same code example shown above, if I change the declaration of the\n`age` object to use the `var` keyword, then, the program gets compiled successfully.\nBecause now, the `zig` compiler detects that we are changing the value of an\nobject that allows this behaviour, because it is a \"variable object\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = 24;\nage = 25;\n```\n:::\n\n\n\n\n\n### Declaring without an initial value\n\nBy default, when you declare a new object in Zig, you must give it\nan initial value. In other words, this means\nthat we have to declare, and, at the same time, initialize every object we\ncreate in our source code.\n\nOn the other hand, you can, in fact, declare a new object in your source code,\nand not give it an explicit value. 
But we need to use a special keyword for that,\nwhich is the `undefined` keyword.\n\nIt is important to emphasize that you should avoid using `undefined` as much as possible.\nBecause when you use this keyword, you leave your object uninitialized, and, as a consequence,\nif for some reason, your code uses this object while it is uninitialized, then, you will definitely\nhave undefined behaviour and major bugs in your program.\n\nIn the example below, I'm declaring the `age` object again. But this time,\nI do not give it an initial value. The variable is only initialized at\nthe second line of code, where I store the number 25 in this object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = undefined;\nage = 25;\n```\n:::\n\n\n\n\nHaving these points in mind, just remember that you should avoid using `undefined` in your code as much as possible.\nAlways declare and initialize your objects, because this gives you much more safety in your program.\nBut in case you really need to declare an object without initializing it... the\n`undefined` keyword is the way to do it in Zig.\n\n\n### There is no such thing as unused objects\n\nEvery object (whether constant or variable) that you declare in Zig **must be used in some way**. You can give this object\nto a function call, as a function argument, or, you can use it in another expression\nto calculate the value of another object, or, you can call a method that belongs to this\nparticular object. \n\nIt doesn't matter in which way you use it, as long as you use it.\nIf you try to break this rule, i.e. if you try to declare an object, but not use it,\nthe `zig` compiler will not compile your Zig source code, and it will issue an error\nmessage warning that you have unused objects in your code.\n\nLet's demonstrate this with an example. In the source code below, we declare a constant object\ncalled `age`. 
If you try to compile a simple Zig program with the line of code below,\nthe compiler will return an error as demonstrated below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 15;\n```\n:::\n\n\n\n\n```\nt.zig:4:11: error: unused local constant\n const age = 15;\n ^~~\n```\n\nEvery time you declare a new object in Zig, you have two choices:\n\n1. you either use the value of this object;\n1. or you explicitly discard the value of the object;\n\nTo explicitly discard the value of any object (constant or variable), all you need to do is to assign\nthis object to a special character in Zig, which is the underscore (`_`).\nWhen you assign an object to an underscore, like in the example below, the `zig` compiler will automatically\ndiscard the value of this particular object.\n\nYou can see in the example below that, this time, the compiler did not\ncomplain about any \"unused constant\", and successfully compiled our source code.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It compiles!\nconst age = 15;\n_ = age;\n```\n:::\n\n\n\n\nNow, remember, every time you assign a particular object to the underscore, this object\nis essentially destroyed. It is discarded by the compiler. This means that you can no longer\nuse this object further in your code. It doesn't exist anymore.\n\nSo if you try to use the constant `age` in the example below, after we discarded it, you\nwill get a loud error message from the compiler (talking about a \"pointless discard\")\nwarning you about this mistake.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It does not compile.\nconst age = 15;\n_ = age;\n// Using a discarded value!\nstd.debug.print(\"{d}\\n\", .{age + 2});\n```\n:::\n\n\n\n\n```\nt.zig:7:5: error: pointless discard\n of local constant\n```\n\n\nThis same rule applies to variable objects. Every variable object must also be used in\nsome way. 
And if you assign a variable object to the underscore,\nthis object also gets discarded, and you can no longer use this object.\n\n\n\n### You must mutate every variable object\n\nEvery variable object that you create in your source code must be mutated at some point.\nIn other words, if you declare an object as a variable\nobject, with the keyword `var`, and you do not change the value of this object\nat some point in the future, the `zig` compiler will detect this,\nand it will raise an error warning you about this mistake.\n\nThe concept behind this is that every object you create in Zig should preferably be a\nconstant object, unless you really need an object whose value will\nchange during the execution of your program.\n\nSo, if I try to declare a variable object such as `where_i_live` below,\nand I do not change the value of this object in some way,\nthe `zig` compiler raises an error message with the phrase \"variable is never mutated\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar where_i_live = \"Belo Horizonte\";\n_ = where_i_live;\n```\n:::\n\n\n\n\n```\nt.zig:7:5: error: local variable is never mutated\nt.zig:7:5: note: consider using 'const'\n```\n\n## Primitive Data Types {#sec-primitive-data-types}\n\nZig has many different primitive data types available for you to use.\nYou can see the full list of available data types at the official\n[Language Reference page](https://ziglang.org/documentation/master/#Primitive-Types)[^lang-data-types].\n\n[^lang-data-types]: .\n\nBut here is a quick list:\n\n- Unsigned integers: `u8`, 8-bit integer; `u16`, 16-bit integer; `u32`, 32-bit integer; `u64`, 64-bit integer; `u128`, 128-bit integer.\n- Signed integers: `i8`, 8-bit integer; `i16`, 16-bit integer; `i32`, 32-bit integer; `i64`, 64-bit integer; `i128`, 128-bit integer.\n- Floating-point numbers: `f16`, 16-bit floating point; `f32`, 32-bit floating point; `f64`, 64-bit floating point; `f128`, 128-bit floating point.\n- Boolean: `bool`, represents true or false 
values.\n- C ABI compatible types: `c_long`, `c_char`, `c_short`, `c_ushort`, `c_int`, `c_uint`, and many others.\n- Pointer sized integers: `isize` and `usize`.\n\n\n\n\n\n\n\n## Arrays {#sec-arrays}\n\nYou create arrays in Zig by using a syntax that resembles the C syntax.\nFirst, you specify the size of the array (i.e. the number of elements that will be stored in the array)\nyou want to create inside a pair of brackets.\n\nThen, you specify the data type of the elements that will be stored inside this array.\nAll elements present in an array in Zig must have the same data type. For example, you cannot mix elements\nof type `f32` with elements of type `i32` in the same array.\n\nAfter that, you simply list the values that you want to store in this array inside\na pair of curly braces.\nIn the example below, I am creating two constant objects that contain different arrays.\nThe first object contains an array of 4 integer values, while the second object contains\nan array of 3 floating point values.\n\nNow, you should notice that in the object `ls`, I am\nnot explicitly specifying the size of the array inside of the brackets. Instead\nof using a literal value (like the value 4 that I used in the `ns` object), I am\nusing the special character underscore (`_`). 
This syntax tells the `zig` compiler\nto fill this field with the number of elements listed inside of the curly braces.\nSo, this syntax `[_]` is for lazy (or smart) programmers who leave the job of\ncounting how many elements there are in the curly braces to the compiler.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst ls = [_]f64{432.1, 87.2, 900.05};\n_ = ns; _ = ls;\n```\n:::\n\n\n\n\nIt is worth noting that these are static arrays, meaning that\nthey cannot grow in size.\nOnce you declare your array, you cannot change the size of it.\nThis is very common in low-level languages,\nbecause low-level languages normally want to give you (the programmer) full control over memory,\nand the way in which arrays are expanded is tightly related to\nmemory management.\n\n\n### Selecting elements of the array {#sec-select-array-elem}\n\nOne very common activity is to select specific portions of an array\nyou have in your source code.\nIn Zig, you can select a specific element from your\narray, by simply providing the index of this particular\nelement inside brackets after the object name.\nIn the example below, I am selecting the third element from the\n`ns` array. Notice that Zig is a \"zero-indexed\" language,\nlike C, C++, Rust, Python, and many other languages.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\ntry stdout.print(\"{d}\\n\", .{ ns[2] });\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n12\n```\n\n\n:::\n:::\n\n\n\n\nIn contrast, you can also select specific slices (or sections) of your array, by using a\nrange selector. 
Some programmers also call these selectors \"slice selectors\",\nand they also exist in Rust, and have the exact same syntax as in Zig.\nAnyway, a range selector is a special expression in Zig that defines\na range of indexes, and it has the syntax `start..end`.\n\nIn the example below, at the second line of code,\nthe `sl` object stores a slice (or a portion) of the\n`ns` array. More precisely, the elements at index 1 and 2\nin the `ns` array. \n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\n_ = sl;\n```\n:::\n\n\n\n\nWhen you use the `start..end` syntax,\nthe \"end tail\" of the range selector is non-inclusive,\nmeaning that the index at the end is not included in the range that is\nselected from the array.\nTherefore, the syntax `start..end` actually selects the indexes from `start` to `end - 1` in practice.\n\nYou can, for example, create a slice that goes from the first to the\nlast elements of the array, by using the `ar[0..ar.len]` syntax.\nIn other words, it is a slice that\naccesses all elements in the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ar = [4]u8{48, 24, 12, 6};\nconst sl = ar[0..ar.len];\n_ = sl;\n```\n:::\n\n\n\n\nYou can also use the syntax `start..` in your range selector,\nwhich tells the `zig` compiler to select the portion of the array\nthat goes from the `start` index until the last element of the array.\nIn the example below, we are selecting the range from index 1\nuntil the end of the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..];\n_ = sl;\n```\n:::\n\n\n\n\n\n### More on slices\n\nAs we discussed before, in Zig, you can select specific portions of an existing\narray. 
This is called *slicing* in Zig [@zigguide], because when you select a portion\nof an array, you are creating a slice object from that array.\n\nA slice object is essentially a pointer object accompanied by a length number.\nThe pointer object points to the first element in the slice, and the\nlength number tells the `zig` compiler how many elements there are in this slice.\n\n> Slices can be thought of as a pair of `[*]T` (the pointer to the data) and a `usize` (the element count) [@zigguide].\n\nThrough the pointer contained inside the slice you can access the elements (or values)\nthat are inside this range (or portion) that you selected from the original array.\nBut the length number (which you can access through the `len` property of your slice object)\nis the really big improvement (over C arrays for example) that Zig brings to the table here.\n\nBecause with this length number\nthe `zig` compiler can easily check if you are trying to access an index that is out of the bounds of this particular slice,\nor, if you are causing any buffer overflow problems. In the example below,\nwe access the `len` property of the slice `sl`, which tells us that this slice\nhas 2 elements in it.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\ntry stdout.print(\"{d}\\n\", .{sl.len});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n2\n```\n\n\n:::\n:::\n\n\n\n\n\n### Array operators\n\nThere are two array operators available in Zig that are very useful.\nThe array concatenation operator (`++`), and the array multiplication operator (`**`). 
As the name suggests,\nthese are array operators.\n\nOne important detail about these two operators is that they work\nonly when both operands have a size (or \"length\") that is compile-time known.\nWe are going to talk more about\nthe differences between \"compile-time known\" and \"runtime known\" at @sec-compile-time.\nBut for now, keep this information in mind, that you cannot use these operators in every situation.\n\nIn summary, the `++` operator creates a new array that is the concatenation\nof both arrays provided as operands. So, the expression `a ++ b` produces\na new array which contains all the elements from arrays `a` and `b`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst b = [_]u8{4,5};\nconst c = a ++ b;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 4, 5 }\n```\n\n\n:::\n:::\n\n\n\n\nThis `++` operator is particularly useful to concatenate strings together.\nStrings in Zig are described in depth at @sec-zig-strings. In summary, a string object in Zig\nis essentially an array of bytes. So, you can use this array concatenation operator\nto effectively concatenate strings together.\n\nIn contrast, the `**` operator is used to replicate an array multiple\ntimes. 
In other words, the expression `a ** 3` creates a new array\nwhich contains the elements of the array `a` repeated 3 times.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst c = a ** 2;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 1, 2, 3 }\n```\n\n\n:::\n:::\n\n\n\n\n\n### Runtime versus compile-time known length in slices\n\nWe are going to talk a lot about the differences between compile-time known\nand runtime known across this book, especially at @sec-compile-time.\nBut the basic idea is that a thing is compile-time known, when we know\neverything (the value, the attributes and the characteristics) about this thing at compile-time.\nIn contrast, a thing is runtime known when its exact value is calculated only at runtime.\nTherefore, we don't know the value of this thing at compile-time, only at runtime.\n\nWe have learned at @sec-select-array-elem that slices are created by using a *range selector*,\nwhich represents a range of indexes. When this \"range of indexes\" (i.e. the start and the end of this range)\nis known at compile-time, the slice object that gets created is actually, under the hood, just\na single-item pointer to an array.\n\nYou don't need to precisely understand what that means now. We are going to talk a lot about pointers\nat @sec-pointer. For now, just understand that, when the range of indexes is known at compile-time,\nthe slice that gets created is just a pointer to an array, accompanied by a length value that\ntells the size of the slice.\n\nIf you have a slice object like this, i.e. a slice that has a compile-time known range,\nyou can use common pointer operations over this slice object. 
For example, you can \ndereference the pointer of this slice, by using the `.*` syntax, like you would\ndo on a normal pointer object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst arr1 = [10]u64 {\n 1, 2, 3, 4, 5,\n 6, 7, 8, 9, 10\n};\n// This slice has a compile-time known range.\n// Because we know both the start and end of the range.\nconst slice = arr1[1..4];\n_ = slice;\n```\n:::\n\n\n\n\n\nOn the other hand, if the range of indexes is not known at compile time, then, the slice object\nthat gets created is a regular slice (a many-item pointer plus a length) instead of a single-item pointer to an array, and, thus, it does not support pointer operations such as `.*`.\nFor example, maybe the start index is known at compile time, but the end index is not. In such\ncase, the range of the slice becomes runtime known only.\n\nIn the example below, the `slice` object has a runtime known range, because the end index of the range\nis not known at compile time. In other words, the size of the array at `buffer` is not known\nat compile time. When we execute this program, the size of the array might be 10, or, it might be 12\ndepending on where we execute it. Therefore, we don't know at compile time if\nthe slice object has a range of size 10, or, a range of size 12.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst builtin = @import(\"builtin\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var n: usize = 0;\n if (builtin.target.os.tag == .windows) {\n n = 10;\n } else {\n n = 12;\n }\n const buffer = try allocator.alloc(u64, n);\n const slice = buffer[0..];\n _ = slice;\n}\n```\n:::\n\n\n\n\n\n## Blocks and scopes {#sec-blocks}\n\nBlocks are created in Zig by a pair of curly braces. A block is just a group of\nexpressions (or statements) contained inside of a pair of curly braces. 
All of these expressions that\nare contained inside of this pair of curly braces belong to the same scope.\n\nIn other words, a block just delimits a scope in your code.\nThe objects that you define inside the same block belong to the same\nscope, and, therefore, are accessible from within this scope.\nAt the same time, these objects are not accessible outside of this scope.\nSo, you could also say that blocks are used to limit the scope of the objects that you create in\nyour source code. In less technical terms, blocks are used to specify where in your source code\nyou can access whatever object you have in your source code.\n\nSo, a block is just a group of expressions contained inside a pair of curly braces.\nAnd every block has its own scope separated from the others.\nThe body of a function is a classic example of a block. If statements, for and while loops\n(and any other structure in the language that uses the pair of curly braces)\nare also examples of blocks.\n\nThis means that every if statement, or for loop,\netc., that you create in your source code has its own separate scope.\nThat is why you can't access the objects that you defined inside\nof your for loop (or if statement) in an outer scope, i.e. a scope outside of the for loop.\nBecause you are trying to access an object that belongs to a scope that is different\nthan your current scope.\n\n\nYou can create blocks within blocks, with multiple levels of nesting.\nYou can also (if you want to) give a label to a particular block, with the colon character (`:`).\nJust write `label:` before you open the pair of curly braces that delimits your block. When you label a block\nin Zig, you can use the `break` keyword to return a value from this block, as if it\nwere a function's body. 
You just write the `break` keyword, followed by the block label in the format `:label`,\nand the expression that defines the value that you want to return.\n\nLike in the example below, where we are returning the value from the `y` object\nfrom the block `add_one`, and saving the result inside the `x` object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar y: i32 = 123;\nconst x = add_one: {\n y += 1;\n break :add_one y;\n};\nif (x == 124 and y == 124) {\n try stdout.print(\"Hey!\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHey!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n\n\n## How strings work in Zig? {#sec-zig-strings}\n\nThe first project that we are going to build and discuss in this book is a base64 encoder/decoder (@sec-base64).\nBut in order for us to build such a thing, we need to get a better understanding on how strings work in Zig.\nSo let's discuss this specific aspect of Zig.\n\nIn Zig, a string literal value is just a pointer to a null-terminated array of bytes (i.e. the same thing as a C string).\nHowever, a string object in Zig is a little more than just a pointer. A string object\nin Zig is an object of type `[]const u8`, and, this object always contains two things: a pointer to the\narray of bytes that you would find in a string literal value, plus a length value.\nEach byte in this \"array of bytes\" is represented by a `u8` value, which is an unsigned 8 bit integer,\nso, it is equivalent to the C data type `unsigned char`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This is a string literal value:\n_ = \"A literal value\";\n// This is a string object:\nconst object: []const u8 = \"A string object\";\n```\n:::\n\n\n\n\nZig always assumes that this sequence of bytes is UTF-8 encoded. 
This might not be true for every\nsequence of bytes you're working with, but it is not really Zig's job to fix the encoding of your strings\n(you can use [`iconv`](https://www.gnu.org/software/libiconv/)[^libiconv] for that).\nToday, most of the text in our modern world, especially on the web, should be UTF-8 encoded.\nSo if your string literal is not UTF-8 encoded, then, you will likely\nhave problems in Zig.\n\n[^libiconv]: <https://www.gnu.org/software/libiconv/>\n\nLet’s take for example the word \"Hello\". In UTF-8, this sequence of characters (H, e, l, l, o)\nis represented by the sequence of decimal numbers 72, 101, 108, 108, 111. In hexadecimal, this\nsequence is `0x48`, `0x65`, `0x6C`, `0x6C`, `0x6F`. So if I take this sequence of hexadecimal values,\nand ask Zig to print this sequence of bytes as a sequence of characters (i.e. a string), then,\nthe text \"Hello\" will be printed into the terminal:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\n\npub fn main() !void {\n const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};\n try stdout.print(\"{s}\\n\", .{bytes});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello\n```\n\n\n:::\n:::\n\n\n\n\n\nIf you want to see the actual bytes that represent a string in Zig, you can use\na `for` loop to iterate through each byte in the string, and ask Zig to print each byte as a hexadecimal\nvalue to the terminal. 
You do that by using a `print()` statement with the `X` formatting specifier,\nlike you would normally do with the [`printf()` function](https://cplusplus.com/reference/cstdio/printf/)[^printfs] in C.\n\n[^printfs]: <https://cplusplus.com/reference/cstdio/printf/>\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_object) |byte| {\n try stdout.print(\"{X} \", .{byte});\n }\n try stdout.print(\"\\n\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: 54 68 69 \n 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F\n 66 20 73 74 72 69 6E 67 20 6C 69 74 65 72 61 6C\n 20 69 6E 20 5A 69 67 \n```\n\n\n:::\n:::\n\n\n\n\n### Strings in C\n\nAt first glance, this looks very similar to how C treats strings as well. In more detail, string values\nin C are treated internally as an array of arbitrary bytes, and this array is also null-terminated.\n\nBut one key difference between a Zig string and a C string, is that Zig also stores the length of\nthe array inside the string object. This small detail makes your code safer, because it is much\neasier for the Zig compiler to check if you are trying to access an element that is \"out of bounds\", i.e. if\nyou are trying to access memory that does not belong to you.\n\nTo achieve this same kind of safety in C, you have to do a lot of work that kind of seems pointless.\nSo getting this kind of safety is not automatic and much harder to do in C. 
For example, if you want\nto track the length of your string throughout your program in C, then, you first need to loop through\nthe array of bytes that represents this string, and find the null element (`'\\0'`) position to discover\nwhere exactly the array ends, or, in other words, to find how many elements the array of bytes contains.\n\nTo do that, you would need something like this in C. In this example, the C string stored in\nthe object `array` is 25 bytes long:\n\n\n\n\n::: {.cell}\n\n```{.c .cell-code}\n#include <stdio.h>\nint main() {\n char* array = \"An example of string in C\";\n int index = 0;\n while (1) {\n if (array[index] == '\\0') {\n break;\n }\n index++;\n }\n printf(\"Number of elements in the array: %d\\n\", index);\n}\n```\n:::\n\n\n\n\n```\nNumber of elements in the array: 25\n```\n\nBut in Zig, you do not have to do this, because the object already contains a `len`\nfield which stores the length information of the array. As an example, the `string_object` object below is 43 bytes long:\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n try stdout.print(\"{d}\\n\", .{string_object.len});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n43\n```\n\n\n:::\n:::\n\n\n\n\n\n### A better look at the object type\n\nNow, we can better inspect the type of the objects that Zig creates. To check the type of any object in Zig, you can use the\n`@TypeOf()` function. If we look at the type of the `simple_array` object below, you will find that this object\nis an array of 4 elements. 
Each element is a signed integer of 32 bits which corresponds to the data type `i32` in Zig.\nThat is what an object of type `[4]i32` is.\n\nBut if we look closely at the type of the `string_object` object below, you will find that this object is a\nconstant pointer (hence the `*const` annotation) to an array of 43 elements (or 43 bytes). Each element is a\nsingle byte (more precisely, an unsigned 8 bit integer - `u8`), that is why we have the `[43:0]u8` portion of the type below.\nIn other words, the string stored inside the `string_object` object is 43 bytes long.\nThat is why you have the type `*const [43:0]u8` below.\n\nIn the case of `string_object`, it is a constant pointer (`*const`) because the object `string_object` is declared\nas constant in the source code (in the line `const string_object = ...`). So, if we changed that for some reason, if\nwe declare `string_object` as a variable object (i.e. `var string_object = ...`), then, `string_object` would be\njust a normal pointer to an array of unsigned 8-bit integers (i.e. `* [43:0]u8`).\n\nNow, if we create a pointer to the `simple_array` object, then, we get a constant pointer to an array of 4 elements (`*const [4]i32`),\nwhich is very similar to the type of the `string_object` object. This demonstrates that a string object (or a string literal)\nin Zig is already a pointer to an array.\n\nJust remember that a \"pointer to an array\" is different than an \"array\". 
So a string object in Zig is a pointer to an array\nof bytes, and not simply an array of bytes.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"This is an example of string literal in Zig\";\n const simple_array = [_]i32{1, 2, 3, 4};\n try stdout.print(\n \"Type of array object: {}\\n\",\n .{@TypeOf(simple_array)}\n );\n try stdout.print(\n \"Type of string object: {}\\n\",\n .{@TypeOf(string_object)}\n );\n try stdout.print(\n \"Type of a pointer that points to the array object: {}\\n\",\n .{@TypeOf(&simple_array)}\n );\n}\n```\n:::\n\n\n\n\n```\nType of array object: [4]i32\nType of string object: *const [43:0]u8\nType of a pointer that points to\n the array object: *const [4]i32\n```\n\n\n### Byte vs unicode points\n\nIt is important to point out that each byte in the array is not necessarily a single character.\nThis fact arises from the difference between a single byte and a single unicode point.\n\nThe Unicode standard assigns a number (which is called a unicode point, or code point) to each character,\nand the UTF-8 encoding stores each unicode point as a sequence of one to four bytes. For example, the character \"H\" is stored in UTF-8 as the decimal number 72. This means that\nthe number 72 is the unicode point for the character \"H\". Each possible character that can appear in a\nUTF-8 encoded string has its own unicode point.\n\nFor example, the Latin Capital Letter A With Stroke (Ⱥ) is represented by the number (or the unicode point)\n570. However, this decimal number (570) is higher than the maximum number stored inside a single byte, which\nis 255. In other words, the maximum decimal number that can be represented with a single byte is 255. 
That is why\nthe unicode point 570 is actually stored inside the computer’s memory as the bytes `C8 BA`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_object = \"Ⱥ\";\n _ = try stdout.write(\n \"Bytes that represents the string object: \"\n );\n for (string_object) |char| {\n try stdout.print(\"{X} \", .{char});\n }\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: C8 BA \n```\n\n\n:::\n:::\n\n\n\n\n\nThis means that to store the character Ⱥ in a UTF-8 encoded string, we need to use two bytes together\nto represent the number 570. That is why the relationship between bytes and unicode points is not always\n1 to 1. Each unicode point is a single character in the string, but a single byte does not always correspond\nto a single unicode point.\n\nAll of this means that if you loop through the elements of a string in Zig, you will be looping through the\nbytes that represent that string, and not through the characters of that string. In the Ⱥ example above,\nthe for loop needed two iterations (instead of a single iteration) to print the two bytes that represent this Ⱥ letter.\n\nNow, all English letters (or ASCII letters if you prefer) can be represented by a single byte in UTF-8. As a\nconsequence, if your UTF-8 string contains only English letters (or ASCII letters), then, you are lucky. Because\nthe number of bytes will be equal to the number of characters in that string. 
In other words, in this specific\nsituation, the relationship between bytes and unicode points is 1 to 1.\n\nBut on the other side, if your string contains other types of letters… for example, you might be working with\ntext data that contains Chinese, Japanese or accented Latin letters, then, the number of bytes necessary to represent\nyour UTF-8 string will be higher than the number of characters in that string.\n\nIf you need to iterate through the characters of a string, instead of its bytes, then, you can use the\n`std.unicode.Utf8View` struct to create an iterator that iterates through the unicode points of your string.\n\nIn the example below, we loop through the Japanese characters “アメリカ”. Each of the four characters in\nthis string is represented by three bytes. But the while loop iterates four times, one iteration for each\ncharacter/unicode point in this string:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n var utf8 = (\n (try std.unicode.Utf8View.init(\"アメリカ\"))\n .iterator()\n );\n while (utf8.nextCodepointSlice()) |codepoint| {\n try stdout.print(\n \"got codepoint {}\\n\",\n .{std.fmt.fmtSliceHexUpper(codepoint)}\n );\n }\n}\n```\n:::\n\n\n\n\n```\ngot codepoint E382A2\ngot codepoint E383A1\ngot codepoint E383AA\ngot codepoint E382AB\n```\n\n\n### Some useful functions for strings {#sec-strings-useful-funs}\n\nIn this section, I just want to quickly describe some functions from the Zig Standard Library\nthat are very useful to use when working with strings. 
Most notably:\n\n- `std.mem.eql()`: to compare if two strings are equal.\n- `std.mem.splitScalar()`: to split a string into substrings (returned through an iterator) given a delimiter value.\n- `std.mem.splitSequence()`: to split a string into substrings (returned through an iterator) given a substring delimiter.\n- `std.mem.startsWith()`: to check if string starts with substring.\n- `std.mem.endsWith()`: to check if string ends with substring.\n- `std.mem.trim()`: to remove specific values from both start and end of the string.\n- `std.mem.concat()`: to concatenate strings together.\n- `std.mem.count()`: to count the occurrences of substring in the string.\n- `std.mem.replace()`: to replace the occurrences of substring in the string.\n\nNotice that all of these functions come from the `mem` module of\nthe Zig Standard Library. This module contains multiple functions and methods\nthat are useful to work with memory and sequences of bytes in general.\n\nThe `eql()` function is used to check if two arrays of data are equal or not.\nSince strings are just arbitrary arrays of bytes, we can use this function to compare two strings together.\nThis function returns a boolean value indicating if the two strings are equal\nor not. The first argument of this function is the data type of the elements of the arrays\nthat are being compared.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name: []const u8 = \"Pedro\";\ntry stdout.print(\n \"{any}\\n\", .{std.mem.eql(u8, name, \"Pedro\")}\n);\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntrue\n```\n\n\n:::\n:::\n\n\n\n\nThe `splitScalar()` and `splitSequence()` functions are useful to split\na string into multiple fragments, like the `split()` method from Python strings. The difference between these two\nfunctions is that `splitScalar()` uses a single character as the separator to\nsplit the string, while `splitSequence()` uses a sequence of characters (a.k.a. a substring)\nas the separator. 
There is a practical example of these functions later in the book.\n\nThe `startsWith()` and `endsWith()` functions are pretty straightforward. They\nreturn a boolean value indicating if the string (or, more precisely, if the array of data)\nbegins (`startsWith`) or ends (`endsWith`) with the sequence provided.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name: []const u8 = \"Pedro\";\ntry stdout.print(\n \"{any}\\n\", .{std.mem.startsWith(u8, name, \"Pe\")}\n);\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntrue\n```\n\n\n:::\n:::\n\n\n\n\nThe `concat()` function, as the name suggests, concatenates two or more strings together.\nBecause the process of concatenating the strings involves allocating enough space to\naccommodate all the strings together, this `concat()` function receives an allocator\nobject as input.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst str1 = \"Hello\";\nconst str2 = \" you!\";\nconst str3 = try std.mem.concat(\n allocator, u8, &[_][]const u8{ str1, str2 }\n);\ntry stdout.print(\"{s}\\n\", .{str3});\n```\n:::\n\n\n\n\n```\nHello you!\n```\n\nAs you can imagine, the `replace()` function is used to replace substrings in a string by another substring.\nThis function works very similarly to the `replace()` method from Python strings. Therefore, you\nprovide a substring to search, and every time that the `replace()` function finds\nthis substring within the input string, it replaces this substring with the \"replacement substring\"\nthat you provided as input.\n\nIn the example below, we are taking the input string \"Hello\", and replacing all occurrences\nof the substring \"el\" inside this input string with \"34\", and saving the results inside the\n`buffer` object. 
As a result, the `replace()` function returns a `usize` value that\nindicates how many replacements were performed.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst str1 = \"Hello\";\nvar buffer: [5]u8 = undefined;\nconst nrep = std.mem.replace(\n u8, str1, \"el\", \"34\", buffer[0..]\n);\ntry stdout.print(\"New string: {s}\\n\", .{buffer});\ntry stdout.print(\"N of replacements: {d}\\n\", .{nrep});\n```\n:::\n\n\n\n\n```\nNew string: H34lo\nN of replacements: 1\n```\n\n\n\n\n\n\n## Safety in Zig\n\nA general trend in modern low-level programming languages is safety. As our modern world\nbecomes more interconnected with technology and computers,\nthe data produced by all of this technology becomes one of the most important\n(and also, one of the most dangerous) assets that we have.\n\nThis is probably the main reason why modern low-level programming languages\nhave been giving great attention to safety, especially memory safety, because\nmemory corruption is still the main target for hackers to exploit.\nThe reality is that we don't have an easy solution for this problem.\nFor now, we only have techniques and strategies that mitigate these\nproblems.\n\nAs Richard Feldman explains in his [most recent GOTO conference talk](https://www.youtube.com/watch?v=jIZpKpLCOiU&ab_channel=GOTOConferences)[^gotop],\nwe haven't yet figured out a way to achieve **true safety in technology**.\nIn other words, we haven't found a way to build software that won't be exploited\nwith 100% certainty. We can greatly reduce the risks of our software being\nexploited, by ensuring memory safety for example. But this is not enough\nto reach \"true safety\" territory.\n\nBecause even if you write your program in a \"safe language\", hackers can still\nexploit failures in the operating system where your program is running (e.g. 
maybe the\nsystem where your code is running has a \"backdoor exploit\" that can still\naffect your code in unexpected ways), or also, they can exploit the features\nfrom the architecture of your computer. A recently found exploit\nthat involves memory invalidation through a feature of \"memory tags\"\npresent in ARM chips is an example of that [@exploit1].\n\n[^gotop]: <https://www.youtube.com/watch?v=jIZpKpLCOiU&ab_channel=GOTOConferences>\n\nThe question is: what have Zig and other languages been doing to mitigate this problem?\nIf we take Rust as an example, Rust is, for the most part[^rust-safe], a memory safe\nlanguage by enforcing specific rules to the developer. In other words, the key feature\nof Rust, the *borrow checker*, forces you to follow a specific logic when you are writing\nyour Rust code, and the Rust compiler will always complain every time you try to go out of this\npattern.\n\n[^rust-safe]: Actually, a lot of existing Rust code is still memory unsafe, because it communicates with external libraries through FFI (*foreign function interface*), which disables the borrow-checker features through the `unsafe` keyword.\n\n\nIn contrast, the Zig language is not a memory safe language by default.\nThere are some memory safety features that you get for free in Zig,\nespecially in arrays and pointer objects. But there are other tools\noffered by the language, that are not used by default.\nIn other words, the `zig` compiler does not force you to use such tools.\n\nThe tools listed below are related to memory safety. That is, they help you to achieve\nmemory safety in your Zig code:\n\n- `defer` allows you to keep free operations physically close to allocations. This helps you to avoid memory leaks, \"use after free\", and also \"double-free\" problems. 
Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.\n- `errdefer` helps you to guarantee that your program frees the allocated memory, even if a runtime error occurs.\n- pointers and objects are non-nullable by default. This helps you to avoid memory problems that might arise from de-referencing null pointers.\n- Zig offers some native types of allocators (called \"testing allocators\") that can detect memory leaks and double-frees. These types of allocators are widely used on unit tests, so they transform your unit tests into a weapon that you can use to detect memory problems in your code.\n- arrays and slices in Zig have their lengths embedded in the object itself, which makes the `zig` compiler very effective at detecting \"index out-of-range\" type of errors, and avoiding buffer overflows.\n\n\nBesides these features that Zig offers that are related to memory safety issues, the language\nalso has some rules that help you to achieve another type of safety, which is more related to\nprogram logic safety. These rules are:\n\n- pointers and objects are non-nullable by default. Which eliminates an edge case that might break the logic of your program.\n- switch statements must exhaust all possible options.\n- the `zig` compiler forces you to handle every possible error in your program.\n\n\n## Other parts of Zig\n\nWe already learned a lot about Zig's syntax, and also, some pretty technical\ndetails about it. 
Just as a quick recap:\n\n- We talked about how functions are written in Zig at @sec-root-file and @sec-main-file.\n- How to create new objects/identifiers at @sec-root-file and especially at @sec-assignments.\n- How strings work in Zig at @sec-zig-strings.\n- How to use arrays and slices at @sec-arrays.\n- How to import functionality from other Zig modules at @sec-root-file.\n\n\nBut, for now, this amount of knowledge is enough for us to continue with this book.\nLater, over the next chapters we will still talk more about other parts of\nZig's syntax that are also equally important. Such as:\n\n\n- How Object-Oriented programming can be done in Zig through *struct declarations* at @sec-structs-and-oop.\n- Basic control flow syntax at @sec-zig-control-flow.\n- Enums at @sec-enum;\n- Pointers and Optionals at @sec-pointer;\n- Error handling with `try` and `catch` at @sec-error-handling;\n- Unit tests at @sec-unittests;\n- Vectors at @sec-vectors-simd;\n- Build System at @sec-build-system;\n\n\n\n\n", + "supporting": [ + "01-zig-weird_files" + ], "filters": [ "rmarkdown/pagebreak.lua" ], diff --git a/_freeze/Chapters/03-structs/execute-results/html.json b/_freeze/Chapters/03-structs/execute-results/html.json index 974fe13..034d22f 100644 --- a/_freeze/Chapters/03-structs/execute-results/html.json +++ b/_freeze/Chapters/03-structs/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "7392fe11f8dca4ee6362c25ddd3e13cc", + "hash": "1a31e28666b2f5e98800c7d8a7daedfb", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Control flow, structs, modules and types\n\nWe have discussed a lot of Zig's syntax in the last chapter,\nespecially in @sec-root-file and @sec-main-file.\nBut we still need to discuss some other very important\nelements of the language. 
Elements that you will use constantly on your day-to-day\nroutine.\n\nWe begin this chapter by discussing the different keywords and structures\nin Zig related to control flow (e.g. loops and if statements).\nThen, we talk about structs and how they can be used to do some\nbasic Object-Oriented (OOP) patterns in Zig. We also talk about\ntype inference and type casting.\nFinally, we end this chapter by discussing modules, and how they relate\nto structs.\n\n\n\n## Control flow {#sec-zig-control-flow}\n\nSometimes, you need to make decisions in your program. Maybe you need to decide\nwhether to execute or not a specific piece of code. Or maybe,\nyou need to apply the same operation over a sequence of values. These kinds of tasks,\ninvolve using structures that are capable of changing the \"control flow\" of our program.\n\nIn computer science, the term \"control flow\" usually refers to the order in which expressions (or commands)\nare evaluated in a given language or program. But this term is also used to refer\nto structures that are capable of changing this \"evaluation order\" of the commands\nexecuted by a given language/program.\n\nThese structures are better known\nby a set of terms, such as: loops, if/else statements, switch statements, among others. So,\nloops and if/else statements are examples of structures that can change the \"control\nflow\" of our program. 
The keywords `continue` and `break` are also examples of symbols\nthat can change the order of evaluation, since they can move our program to the next iteration\nof a loop, or make the loop stop completely.\n\n\n### If/else statements\n\nAn if/else statement performs a \"conditional flow operation\".\nA conditional flow control (or choice control) allows you to execute\nor ignore a certain block of commands based on a logical condition.\nMany programmers and computer science professionals also use\nthe term \"branching\" in this case.\nIn essence, an if/else statement allows us to use the result of a logical test\nto decide whether or not to execute a given block of commands.\n\nIn Zig, we write if/else statements by using the keywords `if` and `else`.\nWe start with the `if` keyword followed by a logical test inside a pair\nof parentheses, followed by a pair of curly braces which contain the lines\nof code to be executed in case the logical test returns the value `true`.\n\nAfter that, you can optionally add an `else` statement. To do that, just add the `else`\nkeyword followed by a pair of curly braces, with the lines of code\nto be executed in case the logical test defined at `if` returns `false`.\n\nIn the example below, we are testing if the object `x` contains a number\nthat is greater than 10. Judging by the output printed to the console,\nwe know that this logical test returned `false`. 
Because the output\nin the console is compatible with the line of code present in the\n`else` branch of the if/else statement.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst x = 5;\nif (x > 10) {\n try stdout.print(\n \"x > 10!\\n\", .{}\n );\n} else {\n try stdout.print(\n \"x <= 10!\\n\", .{}\n );\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nx <= 10!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n### Switch statements {#sec-switch}\n\nSwitch statements are also available in Zig, and they have a very similar syntax to a switch statement in Rust.\nAs you would expect, to write a switch statement in Zig we use the `switch` keyword.\nWe provide the value that we want to \"switch over\" inside a\npair of parentheses. Then, we list the possible combinations (or \"branches\")\ninside a pair of curly braces.\n\nLet's take a look at the code example below. You can see that\nI'm creating an enum type called `Role`. We talk more about enums in @sec-enum.\nBut in summary, this `Role` type is listing different types of roles in a fictitious\ncompany, like `SE` for Software Engineer, `DE` for Data Engineer, `PM` for Product Manager,\netc.\n\nNotice that we are using the value from the `role` object in the\nswitch statement, to discover which exact area we need to store in the `area` variable object.\nAlso notice that we are using type inference inside the switch statement, with the dot character,\nas we are going to describe in @sec-type-inference.\nThis makes the `zig` compiler infer the correct data type of the values (`PM`, `SE`, etc.) for us.\n\nAlso notice that we are grouping multiple values in the same branch of the switch statement.\nWe just separate each possible value with a comma. 
For example, if `role` contains either `DE` or `DA`,\nthe `area` variable would contain the value `\"Data & Analytics\"`, instead of `\"Platform\"` or `\"Sales\"`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n var area: []const u8 = undefined;\n const role = Role.SE;\n switch (role) {\n .PM, .SE, .DPE, .PO => {\n area = \"Platform\";\n },\n .DE, .DA => {\n area = \"Data & Analytics\";\n },\n .KS => {\n area = \"Sales\";\n },\n }\n try stdout.print(\"{s}\\n\", .{area});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nPlatform\n```\n\n\n:::\n:::\n\n\n\n\n\n#### Switch statements must exhaust all possibilities\n\nOne very important aspect about switch statements in Zig\nis that they must exhaust all existing possibilities.\nIn other words, all possible values that could be found inside the `role`\nobject must be explicitly handled in this switch statement.\n\nSince the `role` object has type `Role`, the only possible values to\nbe found inside this object are `PM`, `SE`, `DPE`, `PO`, `DE`, `DA` and `KS`.\nThere are no other possible values to be stored in this `role` object.\nThus, the switch statement must have a combination (branch) for each one of these values.\nThis is what \"exhaust all existing possibilities\" means. The switch statement covers\nevery possible case.\n\nTherefore, you cannot write a switch statement in Zig, and leave an edge case\nwith no explicit action to be taken.\nThis is a similar behaviour to switch statements in Rust, which also have to\nhandle all possible cases.\n\n\n\n#### The else branch\n\nTake a look at the `dump_hex_fallible()` function below as an example. This function\ncomes from the Zig Standard Library. 
More precisely, from the\n[`debug.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/debug.zig)[^debug-mod].\nThere are multiple lines in this function, but I omitted them to focus solely on the\nswitch statement found in this function. Notice that this switch statement has four\npossible cases (i.e. four explicit branches). Also, notice that we used an `else` branch\nin this case.\n\nAn `else` branch in a switch statement works as the \"default branch\".\nWhenever you have multiple cases in your switch statement where\nyou want to apply the exact same action, you can use an `else` branch to do that.\n\n[^debug-mod]: \n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn dump_hex_fallible(bytes: []const u8) !void {\n // Many lines ...\n switch (byte) {\n '\\n' => try writer.writeAll(\"␊\"),\n '\\r' => try writer.writeAll(\"␍\"),\n '\\t' => try writer.writeAll(\"␉\"),\n else => try writer.writeByte('.'),\n }\n}\n```\n:::\n\n\n\n\nMany programmers would also use an `else` branch to handle a \"not supported\" case.\nThat is, a case that cannot be properly handled by your code, or, just a case that\nshould not be \"fixed\". Therefore, you can use an `else` branch to panic (or raise an error)\nin your program to stop the current execution.\n\nTake the code example below. We can see that, we are handling the cases\nfor the `level` object being either 1, 2, or 3. All other possible cases are not supported by default,\nand, as consequence, we raise a runtime error in such cases through the `@panic()` built-in function.\n\nAlso notice that, we are assigning the result of the switch statement to a new object called `category`.\nThis is another thing that you can do with switch statements in Zig. 
If a branch\noutputs a value as result, you can store the result value of the switch statement into\na new object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n:::\n\n\n\n\n```\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\n```\n\n\n\n#### Using ranges in switch\n\nFurthermore, you can also use ranges of values in switch statements.\nThat is, you can create a branch in your switch statement that is used\nwhenever the input value is within the specified range. These \"range expressions\"\nare created with the operator `...`. It is important\nto emphasize that the ranges created by this operator are\ninclusive on both ends.\n\nFor example, I could easily change the previous code example to support all\nlevels between 0 and 100. Like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nbeginner\n```\n\n\n:::\n:::\n\n\n\n\nThis is neat, and it works with character ranges too. That is, I could\nsimply write `'a'...'z'`, to match any character value that is a\nlowercase letter, and it would work fine.\n\n\n#### Labeled switch statements\n\nIn @sec-blocks we have talked about labeling blocks, and also, about using these labels\nto return a value from the block. 
Well, from version 0.14.0 and onwards of the `zig` compiler,\nyou can also apply labels over switch statements, which makes it possible to almost implement a\n\"C `goto`\" like pattern.\n\nFor example, if you give the label `xsw` to a switch statement, you can use this\nlabel in conjunction with the `continue` keyword to go back to the beginning of the switch\nstatement. In the example below, the execution goes back to the beginning of the\nswitch statement two times, before ending at the `3` branch.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nxsw: switch (@as(u8, 1)) {\n 1 => {\n try stdout.print(\"First branch\\n\", .{});\n continue :xsw 2;\n },\n 2 => continue :xsw 3,\n 3 => return,\n 4 => {},\n else => {},\n}\n```\n:::\n\n\n\n\n\n### The `defer` keyword {#sec-defer}\n\nWith the `defer` keyword you can register an expression to be executed when you exit the current scope.\nTherefore, this keyword has similar functionality to the `on.exit()` function from R.\nTake the `foo()` function below as an example. When we execute this `foo()` function, the expression\nthat prints the message \"Exiting function ...\" is executed only when the function exits\nits scope.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nfn foo() !void {\n defer std.debug.print(\n \"Exiting function ...\\n\", .{}\n );\n try stdout.print(\"Adding some numbers ...\\n\", .{});\n const x = 2 + 2; _ = x;\n try stdout.print(\"Multiplying ...\\n\", .{});\n const y = 2 * 8; _ = y;\n}\n\npub fn main() !void {\n try foo();\n}\n```\n:::\n\n\n\n\n```\nAdding some numbers ...\nMultiplying ...\nExiting function ...\n```\n\nTherefore, we can use `defer` to declare an expression that is going to be executed\nwhen your code exits the current scope. Some programmers like to interpret the phrase \"exit of the current scope\"\nas \"the end of the current scope\". 
But this interpretation might not be entirely correct, depending\non what you consider as \"the end of the current scope\".\n\nI mean, what do you consider as **the end** of the current scope? Is it the closing curly bracket (`}`) of the scope?\nIs it when the last expression in the function gets executed? Is it when the function returns to the previous scope?\nEtc. For example, it would not be correct to interpret the \"exit of the current scope\" as the closing\ncurly bracket of the scope. Because the function might exit from an earlier position than this\nclosing curly bracket (e.g. an error value was generated at a previous line inside the function;\nthe function reached an earlier return statement; etc.). Anyway, just be careful with this interpretation.\n\nNow, if you remember what we have discussed in @sec-blocks, there are multiple structures in the language\nthat create their own separate scopes. For/while loops, if/else statements,\nfunctions, normal blocks, etc. This also affects the interpretation of `defer`.\nFor example, if you use `defer` inside a for loop, then, the given expression\nwill be executed every time this specific for loop exits its own scope.\n\nBefore we continue, it is worth emphasizing that the `defer` keyword is an \"unconditional defer\".\nWhich means that the given expression will be executed no matter how the code exits\nthe current scope. For example, your code might exit the current scope because of an error value\nbeing generated, or, because of a return statement, or, a break statement, etc.\n\n\n\n### The `errdefer` keyword {#sec-errdefer1}\n\nIn the previous section, we have discussed the `defer` keyword, which you can use to\nregister an expression to be executed at the exit of the current scope.\nBut this keyword has a sibling, which is the `errdefer` keyword. 
While `defer`\nis an \"unconditional defer\", the `errdefer` keyword is a \"conditional defer\".\nWhich means that the given expression is executed only when you exit the current\nscope under a very specific circumstance.\n\nIn more detail, the expression given to `errdefer` is executed only when an error occurs in the current scope.\nTherefore, if the function (or for/while loop, if/else statement, etc.) exits the current scope\nin a normal situation, without errors, the expression given to `errdefer` is not executed.\n\nThis makes the `errdefer` keyword one of the many tools available in Zig for error handling.\nIn this section, we are more concerned with the control flow aspects around `errdefer`.\nBut we are going to discuss `errdefer` later as an error handling tool in @sec-errdefer2.\n\nThe code example below demonstrates three things:\n\n- that `defer` is an \"unconditional defer\", because the given expression gets executed regardless of how the function `foo()` exits its own scope.\n- that `errdefer` is executed because the function `foo()` returned an error value.\n- that `defer` and `errdefer` expressions are executed in a LIFO (*last in, first out*) order.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn foo() !void { return error.FooError; }\npub fn main() !void {\n var i: usize = 1;\n errdefer std.debug.print(\"Value of i: {d}\\n\", .{i});\n defer i = 2;\n try foo();\n}\n```\n:::\n\n\n\n\n```\nValue of i: 2\nerror: FooError\n/t.zig:6:5: 0x1037e48 in foo (defer)\n return error.FooError;\n ^\n```\n\n\nWhen I say that \"defer expressions\" are executed in a LIFO order, what I want to say is that\nthe last `defer` or `errdefer` expressions in the code are the first ones to be executed.\nYou could also interpret this as: \"defer expressions\" are executed from bottom to top, or,\nfrom last to first.\n\nTherefore, if I change the order of the `defer` and `errdefer` expressions, you will notice that\nthe value of `i` that gets 
printed to the console changes to 1. This doesn't mean that the\n`defer` expression was not executed in this case. This actually means that the `defer` expression\nwas executed only after the `errdefer` expression. The code example below demonstrates this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn foo() !void { return error.FooError; }\npub fn main() !void {\n var i: usize = 1;\n defer i = 2;\n errdefer std.debug.print(\"Value of i: {d}\\n\", .{i});\n try foo();\n}\n```\n:::\n\n\n\n\n```\nValue of i: 1\nerror: FooError\n/t.zig:6:5: 0x1037e48 in foo (defer)\n return error.FooError;\n ^\n```\n\n\n\n\n### For loops\n\nA loop allows you to execute the same lines of code multiple times,\nthus, creating a \"repetition space\" in the execution flow of your program.\nLoops are particularly useful when we want to replicate the same function\n(or the same set of commands) over different inputs.\n\nThere are different types of loops available in Zig. But the most\nessential of them all is probably the *for loop*. A for loop is\nused to apply the same piece of code over the elements of a slice, or, an array.\n\nFor loops in Zig use a syntax that may be unfamiliar to programmers coming from\nother languages. You start with the `for` keyword, then, you\nlist the items that you want to iterate\nover inside a pair of parentheses. Then, inside of a pair of pipes (`|`)\nyou should declare an identifier that will serve as your iterator, or,\nthe \"repetition index of the loop\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (items) |value| {\n // code to execute\n}\n```\n:::\n\n\n\n\nTherefore, instead of using a `(value in items)` syntax,\nin Zig, for loops use the syntax `(items) |value|`. 
In the example\nbelow, you can see that we are looping through the items\nof the array stored at the object `name`, and printing to the\nconsole the decimal representation of each character in this array.\n\nIf we wanted, we could also iterate through a slice (or a portion) of\nthe array, instead of iterating through the entire array stored in the `name` object.\nJust use a range selector to select the section you want. For example,\nI could provide the expression `name[0..3]` to the for loop, to iterate\njust through the first 3 elements in the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n80 | 101 | 100 | 114 | 111 | \n```\n\n\n:::\n:::\n\n\n\n\nIn the above example we are using the value itself of each\nelement in the array as our iterator. But there are many situations where\nwe need to use an index instead of the actual values of the items.\n\nYou can do that by providing a second set of items to iterate over.\nMore precisely, you provide the range selector `0..` to the for loop. So,\nyes, you can use two different iterators at the same time in a for\nloop in Zig.\n\nBut remember from @sec-assignments that, every object\nyou create in Zig must be used in some way. So if you declare two iterators\nin your for loop, you must use both iterators inside the for loop body.\nBut if you want to use just the index iterator, and not use the \"value iterator\",\nthen, you can discard the value iterator by matching the\nvalue items to the underscore character, like in the example below:\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n:::\n\n\n\n\n```\n0 | 1 | 2 | 3 | 4 |\n```\n\n\n### While loops\n\nA while loop is created from the `while` keyword. 
A `for` loop\niterates through the items of an array, but a `while` loop\nwill loop continuously, and infinitely, until a logical test\n(specified by you) becomes false.\n\nYou start with the `while` keyword, then, you define a logical\nexpression inside a pair of parentheses, and the body of the\nloop is provided inside a pair of curly braces, like in the example below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\nYou can also specify the increment expression to be used at the beginning of a while loop.\nTo do that, we write the increment expression inside a pair of parentheses after a colon character (`:`).\nThe code example below demonstrates this other pattern.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) : (i += 1) {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\n### Using `break` and `continue`\n\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, by using\nthe keywords `break` and `continue`, respectively. The `while` loop presented in the next code example is,\nat first sight, an infinite loop. Because the logical value inside the parentheses will always be equal to `true`.\nBut what makes this `while` loop stop when the `i` object reaches the count\n10? It is the `break` keyword!\n\nInside the while loop, we have an if statement that is constantly checking if the `i` variable\nis equal to 10. 
Since we are incrementing the value of `i` at each iteration of the\nwhile loop, this `i` object will eventually be equal to 10, and when it is, the if statement\nwill execute the `break` expression, and, as a result, the execution of the while loop is stopped.\n\nNotice the use of the `expect()` function from the Zig Standard Library after the while loop.\nThis `expect()` function is an \"assert\" type of function.\nThis function checks if the logical test provided is equal to true. If so, the function does nothing.\nOtherwise (i.e. the logical test is equal to false), the function raises an assertion error.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nEverything worked!\n```\n\n\n:::\n:::\n\n\n\n\nSince this code example was executed successfully by the `zig` compiler,\nwithout raising any errors, we know that, after the execution of the while loop,\nthe `i` object is equal to 10. Because if it wasn't equal to 10, an error would have\nbeen raised by `expect()`.\n\nNow, in the next example, we have a use case for\nthe `continue` keyword. The if statement is constantly\nchecking if the current index is a multiple of 2. If\nit is, we jump to the next iteration of the loop.\nOtherwise, the loop just prints the current index to the console.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 3 | 5 | \n```\n\n\n:::\n:::\n\n\n\n\n\n\n## Function parameters are immutable {#sec-fun-pars}\n\nWe have already discussed a lot of the syntax behind function declarations in @sec-root-file and @sec-main-file.\nBut I want to emphasize a curious fact about function parameters (a.k.a. 
function arguments) in Zig.\nIn summary, function parameters are immutable in Zig.\n\nTake the code example below, where we declare a simple function that just tries to add\nsome amount to the input integer, and returns the result back. But if you look closely\nat the body of this `add2()` function, you will notice that we try\nto save the result back into the `x` function argument.\n\nIn other words, this function not only uses the value that it received through the function argument\n`x`, but it also tries to change the value of this function argument, by assigning the addition result\ninto `x`. However, function arguments in Zig are immutable. You cannot change their values, or, you\ncannot assign values to them inside the function's body.\n\nThis is the reason why the code example below does not compile successfully. If you try to compile\nthis code example, you get a compile error warning you that you are trying to change the value of an\nimmutable (i.e. constant) object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn add2(x: u32) u32 {\n x = x + 2;\n return x;\n}\n\npub fn main() !void {\n const y = add2(4);\n std.debug.print(\"{d}\\n\", .{y});\n}\n```\n:::\n\n\n\n\n```\nt.zig:3:5: error: cannot assign to constant\n x = x + 2;\n ^\n```\n\n\nIf a function argument receives as input an object whose data type is\nany of the primitive types that we have listed in @sec-primitive-data-types,\nthis object is always passed by value to the function. In other words, this object\nis copied into the function stack frame.\n\nHowever, if the input object has a more complex data type, for example, it might\nbe a struct instance, or an array, or a union value, etc., in cases like that, the `zig` compiler\nwill take the liberty of deciding for you which strategy is best. Thus, the `zig` compiler will\npass your object to the function either by value, or by reference. 
The compiler will always\nchoose the strategy that is faster for you.\nThis optimization that you get for free is possible only because function arguments are\nimmutable in Zig.\n\nThere are some situations where you might need to change the value of your function argument\ndirectly inside the function's body. This happens more often when we are passing\nC structs as inputs to Zig functions.\n\nIn a situation like this, you can overcome this barrier of immutable function arguments, by simply taking the lead,\nand explicitly choosing to pass the object by reference to the function.\nThat is, instead of depending on the `zig` compiler to decide which strategy is best, you have\nto explicitly mark the function argument as a pointer. This way, we are telling the compiler\nthat this function argument will be passed by reference to the function.\n\nBy making it a pointer, we can finally alter the value of this function argument directly inside\nthe body of the `add2()` function. You can see that the code example below compiles successfully.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn add2(x: *u32) void {\n const d: u32 = 2;\n x.* = x.* + d;\n}\n\npub fn main() !void {\n var x: u32 = 4;\n add2(&x);\n std.debug.print(\"Result: {d}\\n\", .{x});\n}\n```\n:::\n\n\n\n\n```\nResult: 6\n```\n\n\n\n## Structs and OOP {#sec-structs-and-oop}\n\nZig is a language more closely related to C (which is a procedural language),\nthan it is to C++ or Java (which are object-oriented languages). Because of that, you do not\nhave advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or\nclass inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\n\nWith struct definitions, you can create (or define) a new data type in Zig. 
These struct definitions work the same way as they work in C.\nYou give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can\nalso register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object\nthat you create with this new type, will always have these methods available and associated with them.\n\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) which\nis used to construct (or, to instantiate) every object of this particular class, and we also have\na destructor method (or a destructor function), which is the function responsible for destroying\nevery object of this class.\n\nIn Zig, we normally declare the constructor and the destructor methods\nof our structs, by declaring an `init()` and a `deinit()` methods inside the struct.\nThis is just a naming convention that you will find across the entire Zig Standard Library.\nSo, in Zig, the `init()` method of a struct is normally the constructor method of the class represented by this struct.\nWhile the `deinit()` method is the method used for destroying an existing instance of that struct.\n\nThe `init()` and `deinit()` methods are both used extensively in Zig code, and you will see both of\nthem being used when we talk about allocators in @sec-allocators.\nBut, as another example, let's build a simple `User` struct to represent an user of some sort of system.\n\nIf you look at the `User` struct below, you can see the `struct` keyword.\nNotice the data members of this struct, `id`, `name` and `email`. 
Every data member have its\ntype explicitly annotated, with the colon character (`:`) syntax that we described earlier in @sec-root-file.\nBut also notice that every line in the struct body that describes a data member, ends with a comma character (`,`).\nSo every time you declare a data member in your Zig code, always end the line with a comma character, instead\nof ending it with the traditional semicolon character (`;`).\n\nNext, we have registered an `init()` function as a method\nof this `User` struct. This `init()` method is the constructor method that we will use to instantiate\nevery new `User` object. That is why this `init()` function returns a new `User` object as result.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\npedro\n```\n\n\n:::\n:::\n\n\n\n\nThe `pub` keyword plays an important role in struct declarations, and OOP in Zig.\nEvery method that you declare in your struct that is marked with the keyword `pub`,\nbecomes a public method of this particular struct.\n\nSo every method that you create inside your struct, is, at first, a private method\nof that struct. Meaning that, this method can only be called from within this\nstruct. 
But, if you mark this method as public, with the keyword `pub`, then,\nyou can call the method directly from an instance of the `User` struct.\n\nIn other words, the functions marked by the keyword `pub`\nare members of the public API of that struct.\nFor example, if I did not marked the `print_name()` method as public,\nthen, I could not execute the line `u.print_name()`. Because I would\nnot be authorized to call this method directly in my code.\n\n\n\n### Anonymous struct literals {#sec-anonymous-struct-literals}\n\nYou can declare a struct object as a literal value. When we do that, we normally specify the\ndata type of this struct literal by writing its data type just before the opening curly brace.\nFor example, I could write a struct literal value of the type `User` that we have defined\nin the previous section like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst eu = User {\n .id = 1,\n .name = \"Pedro\",\n .email = \"someemail@gmail.com\"\n};\n_ = eu;\n```\n:::\n\n\n\n\nHowever, in Zig, we can also write an anonymous struct literal. That is, you can write a\nstruct literal, but not specify explicitly the type of this particular struct.\nAn anonymous struct is written by using the syntax `.{}`. 
So, we essentially\nreplaced the explicit type of the struct literal with a dot character (`.`).\n\nAs we described in @sec-type-inference, when you put a dot before a struct literal,\nthe type of this struct literal is automatically inferred by the `zig` compiler.\nIn essence, the `zig` compiler will look for some hint of what is the type of that struct.\nThis hint can be the type annotation of a function argument,\nor the return type annotation of the function that you are using, or the type annotation\nof an existing object.\nIf the compiler does find such type annotation, it will use this\ntype in your literal struct.\n\nAnonymous structs are very common to be used as inputs to function arguments in Zig.\nOne example that you have seen already constantly, is the `print()`\nfunction from the `stdout` object.\nThis function takes two arguments.\nThe first argument, is a template string, which should\ncontain string format specifiers in it, which tells how the values provided\nin the second argument should be printed into the message.\n\nWhile the second argument is a struct literal that lists the values\nto be printed into the template message specified in the first argument.\nYou normally want to use an anonymous struct literal here, so that, the\n`zig` compiler do the job of specifying the type of this particular\nanonymous struct for you.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello, world!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n### Struct declarations must be constant\n\nTypes in Zig must be `const` or `comptime` (we are going to talk more about comptime in @sec-comptime).\nWhat this means is that you cannot create a new data type, and mark it as variable with the `var` keyword.\nSo struct declarations are always constant. 
You cannot declare a new struct type using the `var` keyword.\nIt must be `const`.\n\nIn the `Vec3` example below, this declaration is allowed because I'm using the `const` keyword\nto declare this new data type.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n};\n```\n:::\n\n\n\n\n\n### The `self` method argument {#sec-self-arg}\n\nIn every language that have OOP, when we declare a method of some class or struct, we\nusually declare this method as a function that has a `self` argument.\nThis `self` argument is the reference to the object itself from which the method\nis being called from.\n\nIt is not mandatory to use this `self` argument. But why would you not use this `self` argument?\nThere is no reason to not use it. Because the only way to get access to the data stored in the\ndata members of your struct is to access them through this `self` argument.\nIf you don't need to use the data in the data members of your struct inside your method, you very likely don't need\na method. You can just declare this logic as a simple function, outside of your\nstruct declaration.\n\n\nTake the `Vec3` struct below. Inside this `Vec3` struct we declared a method named `distance()`.\nThis method calculates the distance between two `Vec3` objects, by following the distance\nformula in euclidean space. Notice that this `distance()` method takes two `Vec3` objects\nas input, `self` and `other`.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst m = std.math;\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n\n pub fn distance(self: Vec3, other: Vec3) f64 {\n const xd = m.pow(f64, self.x - other.x, 2.0);\n const yd = m.pow(f64, self.y - other.y, 2.0);\n const zd = m.pow(f64, self.z - other.z, 2.0);\n return m.sqrt(xd + yd + zd);\n }\n};\n```\n:::\n\n\n\n\n\nThe `self` argument corresponds to the `Vec3` object from which this `distance()` method\nis being called from. 
The `other` argument, in turn, is a separate `Vec3` object that is given as input\nto this method. In the example below, the `self` argument corresponds to the object\n`v1`, because the `distance()` method is being called from the `v1` object,\nwhile the `other` argument corresponds to the object `v2`.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst v1 = Vec3 {\n    .x = 4.2, .y = 2.4, .z = 0.9\n};\nconst v2 = Vec3 {\n    .x = 5.1, .y = 5.6, .z = 1.6\n};\n\nstd.debug.print(\n    \"Distance: {d}\\n\",\n    .{v1.distance(v2)}\n);\n```\n:::\n\n\n\n\n```\nDistance: 3.3970575502926055\n```\n\n\n\n### About the struct state\n\nSometimes you don't need to care about the state of your struct object. Sometimes, you just need\nto instantiate and use the objects, without altering their state. You can notice this when the methods\ninside your struct declaration use the values that are present in the data members, but\ndo not alter the values in these data members in any way.\n\nThe `Vec3` struct that was presented in @sec-self-arg is an example of that.\nThis struct has a single method named `distance()`, and this method does use the values\npresent in all three data members of the struct (`x`, `y` and `z`). But at the same time,\nthis method does not change the values of these data members at any point.\n\nAs a result of that, when we create `Vec3` objects we usually create them as\nconstant objects, like the `v1` and `v2` objects presented in @sec-self-arg.\nWe could try to create them as variable objects with the `var` keyword.\nBut because the methods of this `Vec3` struct do not change\nthe state of the objects at any point, it's unnecessary to mark them\nas variable objects. In fact, recent versions of the `zig` compiler raise an error\nwhen a local object is declared with `var` but is never mutated.\n\nBut why? Why am I talking about this here? 
It's because the way we annotate the `self` argument\nin a method depends on whether\nthat method changes the state of the object itself.\nMore specifically, when you have a method in a struct that changes the state\nof the object (i.e. changes the value of a data member), the `self` argument\nin this method must be annotated in a different manner.\n\nAs I described in @sec-self-arg, the `self` argument in methods of\na struct is the argument that receives as input the object from which the method\nwas called. We usually annotate this argument in the methods by writing `self`,\nfollowed by the colon character (`:`), and the data type of the struct to which\nthe method belongs (e.g. `User`, `Vec3`, etc.).\n\nIf we take the `Vec3` struct that we defined in the previous section as an example,\nwe can see in the `distance()` method that this `self` argument is annotated as\n`self: Vec3`, because the state of the `Vec3` object is never altered by this\nmethod.\n\nBut what if we do have a method that alters the state of the object, by altering the\nvalues of its data members? How should we annotate `self` in this instance? The answer is:\n\"we should annotate `self` as a pointer to `x`, instead of just `x`\".\nIn other words, you should annotate `self` as `self: *x`, instead of annotating it\nas `self: x`.\n\nIf we create a new method inside the `Vec3` struct that, for example, expands the\nvector by multiplying its coordinates by a factor of two, then, we need to follow\nthe rule specified in the previous paragraph. 
The code example below demonstrates\nthis idea:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst m = std.math;\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n\n pub fn distance(self: Vec3, other: Vec3) f64 {\n const xd = m.pow(f64, self.x - other.x, 2.0);\n const yd = m.pow(f64, self.y - other.y, 2.0);\n const zd = m.pow(f64, self.z - other.z, 2.0);\n return m.sqrt(xd + yd + zd);\n }\n\n pub fn double(self: *Vec3) void {\n self.x = self.x * 2.0;\n self.y = self.y * 2.0;\n self.z = self.z * 2.0;\n }\n};\n```\n:::\n\n\n\n\nNotice in the code example above that we have added a new method\nto our `Vec3` struct named `double()`. This method doubles the\ncoordinate values of our vector object. In the\ncase of the `double()` method, we annotated the `self` argument as `*Vec3`,\nindicating that this argument receives a pointer (or a reference, if you prefer to call it this way)\nto a `Vec3` object as input.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar v3 = Vec3 {\n .x = 4.2, .y = 2.4, .z = 0.9\n};\nv3.double();\nstd.debug.print(\"Doubled: {d}\\n\", .{v3.x});\n```\n:::\n\n\n\n\n```\nDoubled: 8.4\n```\n\n\n\nNow, if you change the `self` argument in this `double()` method to `self: Vec3`, like in the\n`distance()` method, you will get the compiler error exposed below as result. Notice that this\nerror message is showing a line from the `double()` method body,\nindicating that you cannot alter the value of the `x` data member.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// If we change the function signature of double to:\n pub fn double(self: Vec3) void {\n```\n:::\n\n\n\n\n```\nt.zig:16:13: error: cannot assign to constant\n self.x = self.x * 2.0;\n ~~~~^~\n```\n\nThis error message indicates that the `x` data member belongs to a constant object,\nand, because of that, it cannot be changed. 
Ultimately, this error message\nis telling us that the `self` argument is constant.\n\nIf you take some time, and think hard about this error message, you will understand it.\nYou already have the tools to understand why we are getting this error message.\nWe have talked about it already in @sec-fun-pars.\nSo remember, every function argument is immutable in Zig, and `self`\nis no exception to this rule.\n\nIn this example, we marked the `v3` object as a variable object.\nBut this does not matter. Because it is not about the input object, it is about\nthe function argument.\n\nThe problem begins when we try to alter the value of `self` directly, which is a function argument,\nand, every function argument is immutable by default. You may ask yourself how can we overcome\nthis barrier, and once again, the solution was also discussed in @sec-fun-pars.\nWe overcome this barrier, by explicitly marking the `self` argument as a pointer.\n\n\n::: {.callout-note}\nIf a method of your `x` struct alters the state of the object, by\nchanging the value of any data member, then, remember to use `self: *x`,\ninstead of `self: x` in the function signature of this method.\n:::\n\n\nYou could also interpret the content discussed in this section as:\n\"if you need to alter the state of your `x` struct object in one of its methods,\nyou must explicitly pass the `x` struct object by reference to the `self` argument of this method\".\n\n\n\n## Type inference {#sec-type-inference}\n\nZig is a strongly typed language. 
But there are some situations\nwhere you don't have to explicitly write the type of every single object in your source code,\nas you would expect from a traditional strongly typed language, such as C and C++.\n\nIn some situations, the `zig` compiler can use type inference to resolve the data types for you, easing some of\nthe burden that you carry as a developer.\nThe most common way this happens is through function arguments that receive struct objects\nas input.\n\nIn general, type inference in Zig is done by using the dot character (`.`).\nEvery time you see a dot character written before a struct literal, or before an enum value, or something like that,\nyou know that this dot character is playing a special part in this place. More specifically, it is\ntelling the `zig` compiler something along the lines of: \"Hey! Can you infer the type of this\nvalue for me? Please!\". In other words, this dot character plays a role similar to that of the `auto` keyword in C++.\n\nI gave you some examples of this in @sec-anonymous-struct-literals, where we used anonymous struct literals.\nAnonymous struct literals are struct literals whose exact type is\ndetermined through type inference.\nThis type inference is done by looking for some minimal hint of the correct data type to be used.\nYou could say that the `zig` compiler looks for any neighbouring type annotation that might tell it\nwhat the correct type would be.\n\nAnother common place where we use type inference in Zig is in switch statements (which we talked about in @sec-switch).\nI also gave some other examples of type inference in @sec-switch, where we were inferring the data types of enum values listed inside\nof switch statements (e.g. 
`.DE`).\nBut as another example, take a look at this `fence()` function reproduced below,\nwhich comes from the [`atomic.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig)[^fence-fn]\nof the Zig Standard Library.\n\n[^fence-fn]: .\n\nThere are a lot of things in this function that we haven't talked about yet, such as:\nwhat `comptime` means? `inline`? `extern`?\nLet's just ignore all of these things, and focus solely on the switch statement\nthat is inside this function.\n\nWe can see that this switch statement uses the `order` object as input. This `order`\nobject is one of the inputs of this `fence()` function, and we can see in the type annotation,\nthat this object is of type `AtomicOrder`. We can also see a bunch of values inside the\nswitch statements that begin with a dot character, such as `.release` and `.acquire`.\n\nBecause these weird values contain a dot character before them, we are asking the `zig`\ncompiler to infer the types of these values inside the switch statement. 
Then, the `zig`\ncompiler is looking into the current context where these values are being used, and it is\ntrying to infer the types of these values.\n\nSince they are being used inside a switch statement, the `zig` compiler looks into the type\nof the input object given to the switch statement, which is the `order` object in this case.\nBecause this object have type `AtomicOrder`, the `zig` compiler infers that these values\nare data members from this type `AtomicOrder`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub inline fn fence(self: *Self, comptime order: AtomicOrder) void {\n // many lines of code ...\n if (builtin.sanitize_thread) {\n const tsan = struct {\n extern \"c\" fn __tsan_acquire(addr: *anyopaque) void;\n extern \"c\" fn __tsan_release(addr: *anyopaque) void;\n };\n\n const addr: *anyopaque = self;\n return switch (order) {\n .unordered, .monotonic => @compileError(\n @tagName(order)\n ++ \" only applies to atomic loads and stores\"\n ),\n .acquire => tsan.__tsan_acquire(addr),\n .release => tsan.__tsan_release(addr),\n .acq_rel, .seq_cst => {\n tsan.__tsan_acquire(addr);\n tsan.__tsan_release(addr);\n },\n };\n }\n\n return @fence(order);\n}\n```\n:::\n\n\n\n\nThis is how basic type inference is done in Zig. If we didn't use the dot character before\nthe values inside this switch statement, then, we would be forced to write explicitly\nthe data types of these values. For example, instead of writing `.release` we would have to\nwrite `AtomicOrder.release`. We would have to do this for every single value\nin this switch statement, and this is a lot of work. That is why type inference\nis commonly used on switch statements in Zig.\n\n\n\n## Type casting {#sec-type-cast}\n\nIn this section, I want to discuss type casting (or, type conversion) with you.\nWe use type casting when we have an object of type \"x\", and we want to convert\nit into an object of type \"y\", i.e. 
we want to change the data type of the object.\n\nMost languages have a formal way to perform type casting. In Rust for example, we normally\nuse the keyword `as`, and in C, we normally use the type casting syntax, e.g. `(int) x`.\nIn Zig, we use the `@as()` built-in function to cast an object of type \"x\", into\nan object of type \"y\".\n\nThis `@as()` function is the preferred way to perform type conversion (or type casting)\nin Zig, because it is explicit, and it performs the casting only if it\nis unambiguous and safe. To use this function, you just provide the target data type\nas the first argument, and the object that you want to cast as the second argument.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n    const x: usize = 500;\n    const y = @as(u32, x);\n    try expect(@TypeOf(y) == u32);\n}\n```\n:::\n\n\n\n\nThis is the general way to perform type casting in Zig. But remember, `@as()` works only when casting\nis unambiguous and safe, and there are situations where these assumptions do not hold. For example,\nwhen casting an integer value into a float value, or vice-versa, it is not clear to the compiler\nhow to perform this conversion safely.\n\nTherefore, we need to use specialized \"casting functions\" in such situations.\nFor example, if you want to cast an integer value into a float value, then, you\nshould use the `@floatFromInt()` function. In the inverse scenario, you should use\nthe `@intFromFloat()` function.\n\nIn these functions, you just provide the object that you want to\ncast as input. 
Then, the target data type of the \"type casting operation\" is determined by\nthe type annotation of the object where you are saving the results.\nIn the example below, we are casting the object `x` into a value of type `f32`,\nbecause the object `y`, which is where we are saving the results, is annotated\nas an object of type `f32`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n const x: usize = 565;\n const y: f32 = @floatFromInt(x);\n try expect(@TypeOf(y) == f32);\n}\n```\n:::\n\n\n\n\nAnother built-in function that is very useful when performing type casting operations is `@ptrCast()`.\nIn essence, we use the `@as()` built-in function when we want to explicit convert (or cast) a Zig value/object\nfrom a type \"x\" to a type \"y\", etc. However, pointers (we are going to discuss pointers\nin more depth in @sec-pointer) are a special type of object in Zig,\ni.e. they are treated differently from \"normal objects\".\n\nEverytime a pointer is involved in some \"type casting operation\" in Zig, the `@ptrCast()` function is used.\nThis function works similarly to `@floatFromInt()`.\nYou just provide the pointer object that you want to cast as input to this function, and the\ntarget data type is, once again, determined by the type annotation of the object where the results are being\nstored.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n const bytes align(@alignOf(u32)) = [_]u8{\n 0x12, 0x12, 0x12, 0x12\n };\n const u32_ptr: *const u32 = @ptrCast(&bytes);\n try expect(@TypeOf(u32_ptr) == *const u32);\n}\n```\n:::\n\n\n\n\n\n\n\n\n## Modules\n\nWe already talked about what modules are, and also, how to import other modules into\nyour current module via *import statements*. Every Zig module (i.e. a `.zig` file) that you write in your project\nis internally stored as a struct object. 
Take the line shown below as an example. In this line we are importing the\nZig Standard Library into our current module.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n```\n:::\n\n\n\n\nWhen we want to access the functions and objects from the standard library, we\nare basically accessing the data members of the struct stored in the `std`\nobject. That is why we use the same syntax that we use in normal structs, with the dot operator (`.`)\nto access the data members and methods of the struct.\n\nWhen this \"import statement\" gets executed, the result of this expression is a struct\nobject that contains the Zig Standard Library modules, global variables, functions, etc.\nAnd this struct object gets saved (or stored) inside the constant object named `std`.\n\n\nTake the [`thread_pool.zig` module from the project `zap`](https://github.com/kprotty/zap/blob/blog/src/thread_pool.zig)[^thread]\nas an example. This module is written as if it were\na big struct. That is why we have a top-level and public `init()` method\nwritten in this module. The idea is that all top-level functions written in this\nmodule are methods of the struct, and all top-level objects and struct declarations\nare data members of this struct. 
The module is the struct itself.\n\n[^thread]: \n\n\nSo you would import and use this module by doing something like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst ThreadPool = @import(\"thread_pool.zig\");\nconst num_cpus = std.Thread.getCpuCount()\n catch @panic(\"failed to get cpu core count\");\nconst num_threads = std.math.cast(u16, num_cpus)\n catch std.math.maxInt(u16);\nconst pool = ThreadPool.init(\n .{ .max_threads = num_threads }\n);\n```\n:::\n", + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Control flow, structs, modules and types\n\nWe have discussed a lot of Zig's syntax in the last chapter,\nespecially in @sec-root-file and @sec-main-file.\nBut we still need to discuss some other very important\nelements of the language. Elements that you will use constantly on your day-to-day\nroutine.\n\nWe begin this chapter by discussing the different keywords and structures\nin Zig related to control flow (e.g. loops and if statements).\nThen, we talk about structs and how they can be used to do some\nbasic Object-Oriented (OOP) patterns in Zig. We also talk about\ntype inference and type casting.\nFinally, we end this chapter by discussing modules, and how they relate\nto structs.\n\n\n\n## Control flow {#sec-zig-control-flow}\n\nSometimes, you need to make decisions in your program. Maybe you need to decide\nwhether to execute or not a specific piece of code. Or maybe,\nyou need to apply the same operation over a sequence of values. These kinds of tasks,\ninvolve using structures that are capable of changing the \"control flow\" of our program.\n\nIn computer science, the term \"control flow\" usually refers to the order in which expressions (or commands)\nare evaluated in a given language or program. 
But this term is also used to refer\nto structures that are capable of changing this \"evaluation order\" of the commands\nexecuted by a given language/program.\n\nThese structures are better known\nby a set of terms, such as: loops, if/else statements, switch statements, among others. So,\nloops and if/else statements are examples of structures that can change the \"control\nflow\" of our program. The keywords `continue` and `break` are also examples of symbols\nthat can change the order of evaluation, since they can move our program to the next iteration\nof a loop, or make the loop stop completely.\n\n\n### If/else statements\n\nAn if/else statement performs a \"conditional flow operation\".\nA conditional flow control (or choice control) allows you to execute\nor ignore a certain block of commands based on a logical condition.\nMany programmers and computer science professionals also use\nthe term \"branching\" in this case.\nIn essence, an if/else statement allows us to use the result of a logical test\nto decide whether or not to execute a given block of commands.\n\nIn Zig, we write if/else statements by using the keywords `if` and `else`.\nWe start with the `if` keyword followed by a logical test inside a pair\nof parentheses, followed by a pair of curly braces which contains the lines\nof code to be executed in case the logical test returns the value `true`.\n\nAfter that, you can optionally add an `else` statement. To do that, just add the `else`\nkeyword followed by a pair of curly braces, with the lines of code\nto be executed in case the logical test defined at `if` returns `false`.\n\nIn the example below, we are testing if the object `x` contains a number\nthat is greater than 10. Judging by the output printed to the console,\nwe know that this logical test returned `false`. 
Because the output\nin the console is compatible with the line of code present in the\n`else` branch of the if/else statement.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst x = 5;\nif (x > 10) {\n try stdout.print(\n \"x > 10!\\n\", .{}\n );\n} else {\n try stdout.print(\n \"x <= 10!\\n\", .{}\n );\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nx <= 10!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n### Switch statements {#sec-switch}\n\nSwitch statements are also available in Zig, and they have a very similar syntax to a switch statement in Rust.\nAs you would expect, to write a switch statement in Zig we use the `switch` keyword.\nWe provide the value that we want to \"switch over\" inside a\npair of parentheses. Then, we list the possible combinations (or \"branchs\")\ninside a pair of curly braces.\n\nLet's take a look at the code example below. You can see that\nI'm creating an enum type called `Role`. We talk more about enums in @sec-enum.\nBut in summary, this `Role` type is listing different types of roles in a fictitious\ncompany, like `SE` for Software Engineer, `DE` for Data Engineer, `PM` for Product Manager,\netc.\n\nNotice that we are using the value from the `role` object in the\nswitch statement, to discover which exact area we need to store in the `area` variable object.\nAlso notice that we are using type inference inside the switch statement, with the dot character,\nas we are going to describe in @sec-type-inference.\nThis makes the `zig` compiler infer the correct data type of the values (`PM`, `SE`, etc.) for us.\n\nAlso notice that, we are grouping multiple values in the same branch of the switch statement.\nWe just separate each possible value with a comma. 
For example, if `role` contains either `DE` or `DA`,\nthe `area` variable would contain the value `\"Data & Analytics\"`, instead of `\"Platform\"` or `\"Sales\"`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n    SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n    var area: []const u8 = undefined;\n    const role = Role.SE;\n    switch (role) {\n        .PM, .SE, .DPE, .PO => {\n            area = \"Platform\";\n        },\n        .DE, .DA => {\n            area = \"Data & Analytics\";\n        },\n        .KS => {\n            area = \"Sales\";\n        },\n    }\n    try stdout.print(\"{s}\\n\", .{area});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nPlatform\n```\n\n\n:::\n:::\n\n\n\n\n\n#### Switch statements must exhaust all possibilities\n\nOne very important aspect about switch statements in Zig\nis that they must exhaust all existing possibilities.\nIn other words, all possible values that could be found inside the `role`\nobject must be explicitly handled in this switch statement.\n\nSince the `role` object has type `Role`, the only possible values to\nbe found inside this object are `PM`, `SE`, `DPE`, `PO`, `DE`, `DA` and `KS`.\nThere are no other possible values to be stored in this `role` object.\nThus, the switch statement must have a combination (branch) for each one of these values.\nThis is what \"exhaust all existing possibilities\" means. The switch statement covers\nevery possible case.\n\nTherefore, you cannot write a switch statement in Zig, and leave an edge case\nwith no explicit action to be taken.\nThis is a similar behaviour to `match` expressions in Rust, which also have to\nhandle all possible cases.\n\n\n\n#### The else branch\n\nTake a look at the `dump_hex_fallible()` function below as an example. This function\ncomes from the Zig Standard Library. 
More precisely, from the\n[`debug.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/debug.zig)[^debug-mod].\nThere are multiple lines in this function, but I omitted them to focus solely on the\nswitch statement found in this function. Notice that this switch statement has four\nbranches: three explicit cases, plus an `else` branch.\n\nAn `else` branch in a switch statement works as the \"default branch\".\nWhenever you have multiple cases in your switch statement where\nyou want to apply the exact same action, you can use an `else` branch to do that.\n\n[^debug-mod]: \n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn dump_hex_fallible(bytes: []const u8) !void {\n    // Many lines ...\n    switch (byte) {\n        '\\n' => try writer.writeAll(\"␊\"),\n        '\\r' => try writer.writeAll(\"␍\"),\n        '\\t' => try writer.writeAll(\"␉\"),\n        else => try writer.writeByte('.'),\n    }\n}\n```\n:::\n\n\n\n\nMany programmers would also use an `else` branch to handle a \"not supported\" case.\nThat is, a case that cannot be properly handled by your code, or, just a case that\nshould not be \"fixed\". Therefore, you can use an `else` branch to panic (or raise an error)\nin your program to stop the current execution.\n\nTake the code example below. We can see that we are handling the cases\nfor the `level` object being either 1, 2, or 3. All other possible cases are not supported by default,\nand, as a consequence, we raise a runtime error in such cases through the `@panic()` built-in function.\n\nAlso notice that we are assigning the result of the switch statement to a new object called `category`.\nThis is another thing that you can do with switch statements in Zig. 
If a branch\noutputs a value as result, you can store the result value of the switch statement into\na new object.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n:::\n\n\n\n\n```\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\n```\n\n\n\n#### Using ranges in switch\n\nFurthermore, you can also use ranges of values in switch statements.\nThat is, you can create a branch in your switch statement that is used\nwhenever the input value is within the specified range. These \"range expressions\"\nare created with the operator `...`. It is important\nto emphasize that the ranges created by this operator are\ninclusive on both ends.\n\nFor example, I could easily change the previous code example to support all\nlevels between 0 and 100. Like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nbeginner\n```\n\n\n:::\n:::\n\n\n\n\nThis is neat, and it works with character ranges too. That is, I could\nsimply write `'a'...'z'`, to match any character value that is a\nlowercase letter, and it would work fine.\n\n\n#### Labeled switch statements\n\nIn @sec-blocks we have talked about labeling blocks, and also, about using these labels\nto return a value from the block. 
Well, from version 0.14.0 and onwards of the `zig` compiler,\nyou can also apply labels to switch statements, which makes it possible to implement something\nclose to a C `goto` pattern.\n\nFor example, if you give the label `xsw` to a switch statement, you can use this\nlabel in conjunction with the `continue` keyword to go back to the beginning of the switch\nstatement. In the example below, the execution goes back to the beginning of the\nswitch statement two times, before ending at the `3` branch.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nxsw: switch (@as(u8, 1)) {\n    1 => {\n        try stdout.print(\"First branch\\n\", .{});\n        continue :xsw 2;\n    },\n    2 => continue :xsw 3,\n    3 => return,\n    4 => {},\n    // A switch over a `u8` must cover every possible value.\n    else => {},\n}\n```\n:::\n\n\n\n\n\n### The `defer` keyword {#sec-defer}\n\nWith the `defer` keyword you can register an expression to be executed when you exit the current scope.\nTherefore, this keyword has a functionality similar to the `on.exit()` function from R.\nTake the `foo()` function below as an example. When we execute this `foo()` function, the expression\nthat prints the message \"Exiting function ...\" is executed only when the function exits\nits scope.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nfn foo() !void {\n    defer std.debug.print(\n        \"Exiting function ...\\n\", .{}\n    );\n    try stdout.print(\"Adding some numbers ...\\n\", .{});\n    const x = 2 + 2; _ = x;\n    try stdout.print(\"Multiplying ...\\n\", .{});\n    const y = 2 * 8; _ = y;\n}\n\npub fn main() !void {\n    try foo();\n}\n```\n:::\n\n\n\n\n```\nAdding some numbers ...\nMultiplying ...\nExiting function ...\n```\n\nTherefore, we can use `defer` to declare an expression that is going to be executed\nwhen your code exits the current scope. Some programmers like to interpret the phrase \"exit of the current scope\"\nas \"the end of the current scope\". 
But this interpretation might not be entirely correct, depending\non what you consider as \"the end of the current scope\".\n\nI mean, what do you consider as **the end** of the current scope? Is it the closing curly bracket (`}`) of the scope?\nIs it when the last expression in the function gets executed? Is it when the function returns to the previous scope?\nEtc. For example, it would not be correct to interpret the \"exit of the current scope\" as the closing\ncurly bracket of the scope, because the function might exit from an earlier position than this\nclosing curly bracket (e.g. an error value was generated at a previous line inside the function;\nthe function reached an earlier return statement; etc.). Anyway, just be careful with this interpretation.\n\nNow, if you remember what we have discussed in @sec-blocks, there are multiple structures in the language\nthat create their own separate scopes: for/while loops, if/else statements,\nfunctions, normal blocks, etc. This also affects the interpretation of `defer`.\nFor example, if you use `defer` inside a for loop, then, the given expression\nwill be executed every time the body of this for loop exits its own scope, i.e. at the end of each iteration.\n\nBefore we continue, it is worth emphasizing that the `defer` keyword is an \"unconditional defer\".\nThis means that the given expression will be executed no matter how the code exits\nthe current scope. For example, your code might exit the current scope because of an error value\nbeing generated, or, because of a return statement, or, a break statement, etc.\n\n\n\n### The `errdefer` keyword {#sec-errdefer1}\n\nIn the previous section, we have discussed the `defer` keyword, which you can use to\nregister an expression to be executed at the exit of the current scope.\nBut this keyword has a sibling, which is the `errdefer` keyword. 
While `defer`\nis an \"unconditional defer\", the `errdefer` keyword is a \"conditional defer\".\nThis means that the given expression is executed only when you exit the current\nscope under a very specific circumstance.\n\nIn more detail, the expression given to `errdefer` is executed only when an error occurs in the current scope.\nTherefore, if the function (or for/while loop, if/else statement, etc.) exits the current scope\nin a normal situation, without errors, the expression given to `errdefer` is not executed.\n\nThis makes the `errdefer` keyword one of the many tools available in Zig for error handling.\nIn this section, we are more concerned with the control flow aspects around `errdefer`.\nBut we are going to discuss `errdefer` later as an error handling tool in @sec-errdefer2.\n\nThe code example below demonstrates three things:\n\n- that `defer` is an \"unconditional defer\", because the given expression gets executed regardless of how the function `main()` exits its own scope.\n- that `errdefer` is executed because `main()` exits with the error value returned by the function `foo()`.\n- that `defer` and `errdefer` expressions are executed in a LIFO (*last in, first out*) order.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn foo() !void { return error.FooError; }\npub fn main() !void {\n    var i: usize = 1;\n    errdefer std.debug.print(\"Value of i: {d}\\n\", .{i});\n    defer i = 2;\n    try foo();\n}\n```\n:::\n\n\n\n\n```\nValue of i: 2\nerror: FooError\n/t.zig:6:5: 0x1037e48 in foo (defer)\n return error.FooError;\n ^\n```\n\n\nWhen I say that \"defer expressions\" are executed in a LIFO order, what I mean is that\nthe last `defer` or `errdefer` expressions in the code are the first ones to be executed.\nYou could also interpret this as: \"defer expressions\" are executed from bottom to top, or,\nfrom last to first.\n\nTherefore, if I change the order of the `defer` and `errdefer` expressions, you will notice that\nthe value of `i` that gets 
printed to the console changes to 1. This doesn't mean that the\n`defer` expression was not executed in this case. This actually means that the `defer` expression\nwas executed only after the `errdefer` expression. The code example below demonstrates this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn foo() !void { return error.FooError; }\npub fn main() !void {\n var i: usize = 1;\n defer i = 2;\n errdefer std.debug.print(\"Value of i: {d}\\n\", .{i});\n try foo();\n}\n```\n:::\n\n\n\n\n```\nValue of i: 1\nerror: FooError\n/t.zig:6:5: 0x1037e48 in foo (defer)\n return error.FooError;\n ^\n```\n\n\n\n\n### For loops\n\nA loop allows you to execute the same lines of code multiple times,\nthus, creating a \"repetition space\" in the execution flow of your program.\nLoops are particularly useful when we want to replicate the same function\n(or the same set of commands) over different inputs.\n\nThere are different types of loops available in Zig. But the most\nessential of them all is probably the *for loop*. A for loop is\nused to apply the same piece of code over the elements of a slice, or, an array.\n\nFor loops in Zig use a syntax that may be unfamiliar to programmers coming from\nother languages. You start with the `for` keyword, then, you\nlist the items that you want to iterate\nover inside a pair of parentheses. Then, inside of a pair of pipes (`|`)\nyou should declare an identifier that will serve as your iterator, or,\nthe \"repetition index of the loop\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (items) |value| {\n // code to execute\n}\n```\n:::\n\n\n\n\nTherefore, instead of using a `(value in items)` syntax,\nin Zig, for loops use the syntax `(items) |value|`. 
In the example\nbelow, you can see that we are looping through the items\nof the array stored at the object `name`, and printing to the\nconsole the decimal representation of each character in this array.\n\nIf we wanted, we could also iterate through a slice (or a portion) of\nthe array, instead of iterating through the entire array stored in the `name` object.\nJust use a range selector to select the section you want. For example,\nI could provide the expression `name[0..3]` to the for loop, to iterate\njust through the first 3 elements in the array.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n80 | 101 | 100 | 114 | 111 | \n```\n\n\n:::\n:::\n\n\n\n\nIn the above example we are using the value itself of each\nelement in the array as our iterator. But there are many situations where\nwe need to use an index instead of the actual values of the items.\n\nYou can do that by providing a second set of items to iterate over.\nMore precisely, you provide the range selector `0..` to the for loop. So,\nyes, you can use two different iterators at the same time in a for\nloop in Zig.\n\nBut remember from @sec-assignments that, every object\nyou create in Zig must be used in some way. So if you declare two iterators\nin your for loop, you must use both iterators inside the for loop body.\nBut if you want to use just the index iterator, and not use the \"value iterator\",\nthen, you can discard the value iterator by matching the\nvalue items to the underscore character, like in the example below:\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n:::\n\n\n\n\n```\n0 | 1 | 2 | 3 | 4 |\n```\n\n\n### While loops\n\nA while loop is created from the `while` keyword. 
A `for` loop\niterates through the items of an array, but a `while` loop\nwill keep looping until a logical test\n(specified by you) becomes false.\n\nYou start with the `while` keyword, then, you define a logical\nexpression inside a pair of parentheses, and the body of the\nloop is provided inside a pair of curly braces, like in the example below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\nYou can also specify the increment expression in the header of a while loop, so that it is executed at the end of each iteration.\nTo do that, we write the increment expression inside a pair of parentheses after a colon character (`:`).\nThe code example below demonstrates this other pattern.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) : (i += 1) {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\n### Using `break` and `continue`\n\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, by using\nthe keywords `break` and `continue`, respectively. The `while` loop presented in the next code example is,\nat first sight, an infinite loop. Because the logical value inside the parentheses will always be equal to `true`.\nBut what makes this `while` loop stop when the `i` object reaches the count\n10? It is the `break` keyword!\n\nInside the while loop, we have an if statement that is constantly checking if the `i` variable\nis equal to 10. 
Since we are incrementing the value of `i` at each iteration of the\nwhile loop, this `i` object will eventually be equal to 10, and when it is, the if statement\nwill execute the `break` expression, and, as a result, the execution of the while loop is stopped.\n\nNotice the use of the `expect()` function from the Zig Standard Library after the while loop.\nThis `expect()` function is an \"assert\" type of function.\nThis function checks if the logical test provided is equal to true. If so, the function does nothing.\nOtherwise (i.e. the logical test is equal to false), the function raises an assertion error.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nEverything worked!\n```\n\n\n:::\n:::\n\n\n\n\nSince this code example was executed successfully by the `zig` compiler,\nwithout raising any errors, we know that, after the execution of the while loop,\nthe `i` object is equal to 10. Because if it wasn't equal to 10, an error would have\nbeen raised by `expect()`.\n\nNow, in the next example, we have a use case for\nthe `continue` keyword. The if statement is constantly\nchecking if the current value is a multiple of 2. If\nit is, we jump to the next iteration of the loop.\nOtherwise, the loop just prints the current value to the console.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 3 | 5 | \n```\n\n\n:::\n:::\n\n\n\n\n\n\n## Function parameters are immutable {#sec-fun-pars}\n\nWe have already discussed a lot of the syntax behind function declarations in @sec-root-file and @sec-main-file.\nBut I want to emphasize a curious fact about function parameters (a.k.a. 
function arguments) in Zig.\nIn summary, function parameters are immutable in Zig.\n\nTake the code example below, where we declare a simple function that just tries to add\nsome amount to the input integer, and returns the result back. If you look closely\nat the body of this `add2()` function, you will notice that we try\nto save the result back into the `x` function argument.\n\nIn other words, this function not only uses the value that it received through the function argument\n`x`, but it also tries to change the value of this function argument, by assigning the addition result\ninto `x`. However, function arguments in Zig are immutable. You cannot change their values, or, you\ncannot assign values to them inside the function's body.\n\nThis is the reason why the code example below does not compile successfully. If you try to compile\nthis code example, you will get a compile error message about \"trying to change the value of an\nimmutable (i.e. constant) object\".\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn add2(x: u32) u32 {\n x = x + 2;\n return x;\n}\n\npub fn main() !void {\n const y = add2(4);\n std.debug.print(\"{d}\\n\", .{y});\n}\n```\n:::\n\n\n\n\n```\nt.zig:3:5: error: cannot assign to constant\n x = x + 2;\n ^\n```\n\n\n### A free optimization\n\nIf a function argument receives as input an object whose data type is\nany of the primitive types that we have listed in @sec-primitive-data-types,\nthis object is always passed by value to the function. In other words, this object\nis copied into the function stack frame.\n\nHowever, if the input object has a more complex data type, for example, it might\nbe a struct instance, or an array, or a union value, etc., in cases like that, the `zig` compiler\nwill take the liberty of deciding for you which strategy is best. Thus, the `zig` compiler will\npass your object to the function either by value, or by reference. 
The compiler will always\nchoose the strategy that is faster for you.\nThis optimization that you get for free is possible only because function arguments are\nimmutable in Zig.\n\n\n### How to overcome this barrier\n\nThere are some situations where you might need to change the value of your function argument\ndirectly inside the function's body. This happens more often when we are passing\nC structs as inputs to Zig functions.\n\nIn a situation like this, you can overcome this barrier by using a pointer. In other words,\ninstead of passing a value as input to the argument, you can pass a \"pointer to value\" instead.\nYou can change the value that the pointer points to, by dereferencing it.\n\nTherefore, if we take our previous `add2()` example, we can change the value of the\nfunction argument `x` inside the function's body by marking the `x` argument as a\n\"pointer to a `u32` value\" (i.e. `*u32` data type), instead of a `u32` value.\nBy making it a pointer, we can finally alter the value of this function argument directly inside\nthe body of the `add2()` function. You can see that the code example below compiles successfully.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn add2(x: *u32) void {\n const d: u32 = 2;\n x.* = x.* + d;\n}\n\npub fn main() !void {\n var x: u32 = 4;\n add2(&x);\n std.debug.print(\"Result: {d}\\n\", .{x});\n}\n```\n:::\n\n\n\n\n```\nResult: 6\n```\n\n\nEven in this code example above, the `x` argument is still immutable. Which means that the pointer itself is immutable.\nTherefore, you cannot change the memory address that it points to. However, you can dereference the pointer\nto access the value that it points to, and also, to change this value, if you need to.\n\n\n\n\n\n## Structs and OOP {#sec-structs-and-oop}\n\nZig is a language more closely related to C (which is a procedural language),\nthan it is to C++ or Java (which are object-oriented languages). 
Because of that, you do not\nhave advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or\nclass inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\n\nWith struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C.\nYou give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can\nalso register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object\nthat you create with this new type, will always have these methods available and associated with them.\n\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) which\nis used to construct (or, to instantiate) every object of this particular class, and we also have\na destructor method (or a destructor function), which is the function responsible for destroying\nevery object of this class.\n\nIn Zig, we normally declare the constructor and the destructor methods\nof our structs, by declaring an `init()` and a `deinit()` method inside the struct.\nThis is just a naming convention that you will find across the entire Zig Standard Library.\nSo, in Zig, the `init()` method of a struct is normally the constructor method of the class represented by this struct.\nWhile the `deinit()` method is the method used for destroying an existing instance of that struct.\n\nThe `init()` and `deinit()` methods are both used extensively in Zig code, and you will see both of\nthem being used when we talk about allocators in @sec-allocators.\nBut, as another example, let's build a simple `User` struct to represent a user of some sort of system.\n\nIf you look at the `User` struct below, you can see the `struct` keyword.\nNotice the data members of this struct, `id`, `name` and `email`. 
Every data member has its\ntype explicitly annotated, with the colon character (`:`) syntax that we described earlier in @sec-root-file.\nBut also notice that every line in the struct body that describes a data member, ends with a comma character (`,`).\nSo every time you declare a data member in your Zig code, always end the line with a comma character, instead\nof ending it with the traditional semicolon character (`;`).\n\nNext, we have registered an `init()` function as a method\nof this `User` struct. This `init()` method is the constructor method that we will use to instantiate\nevery new `User` object. That is why this `init()` function returns a new `User` object as a result.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\npedro\n```\n\n\n:::\n:::\n\n\n\n\nThe `pub` keyword plays an important role in struct declarations, and OOP in Zig.\nEvery method that you declare in your struct that is marked with the keyword `pub`,\nbecomes a public method of this particular struct.\n\nSo every method that you create inside your struct, is, at first, a private method\nof that struct. Meaning that, this method can only be called from within the\nmodule (i.e. the `.zig` file) where the struct is declared. 
But, if you mark this method as public, with the keyword `pub`, then,\nyou can also call the method from other modules that import the `User` struct.\n\nIn other words, the functions marked by the keyword `pub`\nare members of the public API of that struct.\nFor example, if I did not mark the `print_name()` method as public,\nthen, code in a different module could not execute the line `u.print_name()`. Because it would\nnot be authorized to call this method directly.\n\n\n\n### Anonymous struct literals {#sec-anonymous-struct-literals}\n\nYou can declare a struct object as a literal value. When we do that, we normally specify the\ndata type of this struct literal by writing its data type just before the opening curly brace.\nFor example, I could write a struct literal value of the type `User` that we have defined\nin the previous section like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst eu = User {\n .id = 1,\n .name = \"Pedro\",\n .email = \"someemail@gmail.com\"\n};\n_ = eu;\n```\n:::\n\n\n\n\nHowever, in Zig, we can also write an anonymous struct literal. That is, you can write a\nstruct literal, but not specify explicitly the type of this particular struct.\nAn anonymous struct is written by using the syntax `.{}`. 
So, we essentially\nreplaced the explicit type of the struct literal with a dot character (`.`).\n\nAs we describe in @sec-type-inference, when you put a dot before a struct literal,\nthe type of this struct literal is automatically inferred by the `zig` compiler.\nIn essence, the `zig` compiler will look for some hint of what the type of that struct is.\nThis hint can be the type annotation of a function argument,\nor the return type annotation of the function that you are using, or the type annotation\nof an existing object.\nIf the compiler does find such a type annotation, it will use this\ntype in your literal struct.\n\nAnonymous structs are very commonly used as inputs to function arguments in Zig.\nOne example that you have already seen many times is the `print()`\nfunction from the `stdout` object.\nThis function takes two arguments.\nThe first argument is a template string, which should\ncontain string format specifiers in it, which tell how the values provided\nin the second argument should be printed into the message.\n\nThe second argument is a struct literal that lists the values\nto be printed into the template message specified in the first argument.\nYou normally want to use an anonymous struct literal here, so that the\n`zig` compiler does the job of specifying the type of this particular\nanonymous struct for you.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello, world!\n```\n\n\n:::\n:::\n\n\n\n\n\n\n### Struct declarations must be constant\n\nTypes in Zig must be `const` or `comptime` (we are going to talk more about comptime in @sec-comptime).\nWhat this means is that you cannot create a new data type, and mark it as variable with the `var` keyword.\nSo struct declarations are always constant. 
You cannot declare a new struct type using the `var` keyword.\nIt must be `const`.\n\nIn the `Vec3` example below, this declaration is allowed because I'm using the `const` keyword\nto declare this new data type.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n};\n```\n:::\n\n\n\n\n\n### The `self` method argument {#sec-self-arg}\n\nIn most languages that support OOP, when we declare a method of some class or struct, we\nusually declare this method as a function that has a `self` argument.\nThis `self` argument is the reference to the object itself from which the method\nis being called.\n\nIt is not mandatory to use this `self` argument. But why would you not use this `self` argument?\nThere is no reason to not use it. Because the only way to get access to the data stored in the\ndata members of your struct is to access them through this `self` argument.\nIf you don't need to use the data in the data members of your struct inside your method, you very likely don't need\na method. You can just declare this logic as a simple function, outside of your\nstruct declaration.\n\n\nTake the `Vec3` struct below. Inside this `Vec3` struct we declared a method named `distance()`.\nThis method calculates the distance between two `Vec3` objects, by following the distance\nformula in Euclidean space. Notice that this `distance()` method takes two `Vec3` objects\nas input, `self` and `other`.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst m = std.math;\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n\n pub fn distance(self: Vec3, other: Vec3) f64 {\n const xd = m.pow(f64, self.x - other.x, 2.0);\n const yd = m.pow(f64, self.y - other.y, 2.0);\n const zd = m.pow(f64, self.z - other.z, 2.0);\n return m.sqrt(xd + yd + zd);\n }\n};\n```\n:::\n\n\n\n\n\nThe `self` argument corresponds to the `Vec3` object from which this `distance()` method\nis being called. 
The `other` argument is a separate `Vec3` object that is given as input\nto this method. In the example below, the `self` argument corresponds to the object\n`v1`, because the `distance()` method is being called from the `v1` object,\nwhile the `other` argument corresponds to the object `v2`.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst v1 = Vec3 {\n .x = 4.2, .y = 2.4, .z = 0.9\n};\nconst v2 = Vec3 {\n .x = 5.1, .y = 5.6, .z = 1.6\n};\n\nstd.debug.print(\n \"Distance: {d}\\n\",\n .{v1.distance(v2)}\n);\n```\n:::\n\n\n\n\n```\nDistance: 3.3970575502926055\n```\n\n\n\n### About the struct state\n\nSometimes you don't need to care about the state of your struct object. Sometimes, you just need\nto instantiate and use the objects, without altering their state. You can notice that when you have methods\ninside your struct declaration that might use the values that are present in the data members, but they\ndo not alter the values in these data members of the struct in any way.\n\nThe `Vec3` struct that was presented in @sec-self-arg is an example of that.\nThis struct has a single method named `distance()`, and this method does use the values\npresent in all three data members of the struct (`x`, `y` and `z`). But at the same time,\nthis method does not change the values of these data members at any point.\n\nAs a result of that, when we create `Vec3` objects we usually create them as\nconstant objects, like the `v1` and `v2` objects presented in @sec-self-arg.\nWe can create them as variable objects with the `var` keyword,\nif we want to. But because the methods of this `Vec3` struct do not change\nthe state of the objects at any point, it's unnecessary to mark them\nas variable objects.\n\nBut why? Why am I talking about this here? 
It's because the `self` argument\nin the methods is affected depending on whether the\nmethods present in a struct change or don't change the state of the object itself.\nMore specifically, when you have a method in a struct that changes the state\nof the object (i.e. changes the value of a data member), the `self` argument\nin this method must be annotated in a different manner.\n\nAs I described in @sec-self-arg, the `self` argument in methods of\na struct is the argument that receives as input the object from which the method\nwas called. We usually annotate this argument in the methods by writing `self`,\nfollowed by the colon character (`:`), and the data type of the struct to which\nthe method belongs (e.g. `User`, `Vec3`, etc.).\n\nIf we take the `Vec3` struct that we defined in the previous section as an example,\nwe can see in the `distance()` method that this `self` argument is annotated as\n`self: Vec3`. Because the state of the `Vec3` object is never altered by this\nmethod.\n\nBut what if we do have a method that alters the state of the object, by altering the\nvalues of its data members, how should we annotate `self` in this instance? The answer is:\n\"we should annotate `self` as a pointer to `x`, instead of just `x`\".\nIn other words, you should annotate `self` as `self: *x`, instead of annotating it\nas `self: x`.\n\nIf we create a new method inside the `Vec3` struct that, for example, expands the\nvector by multiplying its coordinates by a factor of two, then, we need to follow\nthis rule specified in the previous paragraph. 
The code example below demonstrates\nthis idea:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst m = std.math;\nconst Vec3 = struct {\n x: f64,\n y: f64,\n z: f64,\n\n pub fn distance(self: Vec3, other: Vec3) f64 {\n const xd = m.pow(f64, self.x - other.x, 2.0);\n const yd = m.pow(f64, self.y - other.y, 2.0);\n const zd = m.pow(f64, self.z - other.z, 2.0);\n return m.sqrt(xd + yd + zd);\n }\n\n pub fn twice(self: *Vec3) void {\n self.x = self.x * 2.0;\n self.y = self.y * 2.0;\n self.z = self.z * 2.0;\n }\n};\n```\n:::\n\n\n\n\nNotice in the code example above that we have added a new method\nto our `Vec3` struct named `twice()`. This method doubles the\ncoordinate values of our vector object. In the\ncase of the `twice()` method, we annotated the `self` argument as `*Vec3`,\nindicating that this argument receives a pointer (or a reference, if you prefer to call it this way)\nto a `Vec3` object as input.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar v3 = Vec3 {\n .x = 4.2, .y = 2.4, .z = 0.9\n};\nv3.twice();\nstd.debug.print(\"Doubled: {d}\\n\", .{v3.x});\n```\n:::\n\n\n\n\n```\nDoubled: 8.4\n```\n\n\n\nNow, if you change the `self` argument in this `twice()` method to `self: Vec3`, like in the\n`distance()` method, you will get the compiler error shown below as a result. Notice that this\nerror message is showing a line from the `twice()` method body,\nindicating that you cannot alter the value of the `x` data member.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// If we change the function signature of twice() to:\n pub fn twice(self: Vec3) void {\n```\n:::\n\n\n\n\n```\nt.zig:16:13: error: cannot assign to constant\n self.x = self.x * 2.0;\n ~~~~^~\n```\n\nThis error message indicates that the `x` data member belongs to a constant object,\nand, because of that, it cannot be changed. 
Ultimately, this error message\nis telling us that the `self` argument is constant.\n\nIf you take some time, and think hard about this error message, you will understand it.\nYou already have the tools to understand why we are getting this error message.\nWe have talked about it already in @sec-fun-pars.\nSo remember, every function argument is immutable in Zig, and `self`\nis no exception to this rule.\n\nIn this example, we marked the `v3` object as a variable object.\nBut this does not matter. Because it is not about the input object, it is about\nthe function argument.\n\nThe problem begins when we try to alter the value of `self` directly, which is a function argument,\nand, every function argument is immutable by default. You may ask yourself how can we overcome\nthis barrier, and once again, the solution was also discussed in @sec-fun-pars.\nWe overcome this barrier, by explicitly marking the `self` argument as a pointer.\n\n\n::: {.callout-note}\nIf a method of your `x` struct alters the state of the object, by\nchanging the value of any data member, then, remember to use `self: *x`,\ninstead of `self: x` in the function signature of this method.\n:::\n\n\nYou could also interpret the content discussed in this section as:\n\"if you need to alter the state of your `x` struct object in one of its methods,\nyou must explicitly pass the `x` struct object by reference to the `self` argument of this method\".\n\n\n\n## Type inference {#sec-type-inference}\n\nZig is a strongly typed language. 
But, there are some situations\nwhere you don't have to explicitly write the type of every single object in your source code,\nas you would expect from a traditional strongly typed language, such as C and C++.\n\nIn some situations, the `zig` compiler can use type inference to solve the data types for you, easing some of\nthe burden that you carry as a developer.\nThe most common way this happens is through function arguments that receive struct objects\nas input.\n\nIn general, type inference in Zig is done by using the dot character (`.`).\nEvery time you see a dot character written before a struct literal, or before an enum value, or something like that,\nyou know that this dot character is playing a special part in this place. More specifically, it is\ntelling the `zig` compiler something along the lines of: \"Hey! Can you infer the type of this\nvalue for me? Please!\". In other words, this dot character is playing a similar role as the `auto` keyword in C++.\n\nI gave you some examples of this in @sec-anonymous-struct-literals, where we used anonymous struct literals.\nAnonymous struct literals are struct literals that use type inference to\ninfer the exact type of this particular struct literal.\nThis type inference is done by looking for some minimal hint of the correct data type to be used.\nYou could say that the `zig` compiler looks for any neighbouring type annotation that might tell it\nwhat would be the correct type.\n\nAnother common place where we use type inference in Zig is at switch statements (which we talked about in @sec-switch).\nI also gave some other examples of type inference in @sec-switch, where we were inferring the data types of enum values listed inside\nof switch statements (e.g. 
`.DE`).\nBut as another example, take a look at this `fence()` function reproduced below,\nwhich comes from the [`atomic.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig)[^fence-fn]\nof the Zig Standard Library.\n\n[^fence-fn]: .\n\nThere are a lot of things in this function that we haven't talked about yet, such as:\nwhat `comptime` means? `inline`? `extern`?\nLet's just ignore all of these things, and focus solely on the switch statement\nthat is inside this function.\n\nWe can see that this switch statement uses the `order` object as input. This `order`\nobject is one of the inputs of this `fence()` function, and we can see in the type annotation,\nthat this object is of type `AtomicOrder`. We can also see a bunch of values inside the\nswitch statements that begin with a dot character, such as `.release` and `.acquire`.\n\nBecause these weird values contain a dot character before them, we are asking the `zig`\ncompiler to infer the types of these values inside the switch statement. 
Then, the `zig`\ncompiler is looking into the current context where these values are being used, and it is\ntrying to infer the types of these values.\n\nSince they are being used inside a switch statement, the `zig` compiler looks into the type\nof the input object given to the switch statement, which is the `order` object in this case.\nBecause this object has type `AtomicOrder`, the `zig` compiler infers that these values\nare members of this type `AtomicOrder`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub inline fn fence(self: *Self, comptime order: AtomicOrder) void {\n // many lines of code ...\n if (builtin.sanitize_thread) {\n const tsan = struct {\n extern \"c\" fn __tsan_acquire(addr: *anyopaque) void;\n extern \"c\" fn __tsan_release(addr: *anyopaque) void;\n };\n\n const addr: *anyopaque = self;\n return switch (order) {\n .unordered, .monotonic => @compileError(\n @tagName(order)\n ++ \" only applies to atomic loads and stores\"\n ),\n .acquire => tsan.__tsan_acquire(addr),\n .release => tsan.__tsan_release(addr),\n .acq_rel, .seq_cst => {\n tsan.__tsan_acquire(addr);\n tsan.__tsan_release(addr);\n },\n };\n }\n\n return @fence(order);\n}\n```\n:::\n\n\n\n\nThis is how basic type inference is done in Zig. If we didn't use the dot character before\nthe values inside this switch statement, then, we would be forced to write explicitly\nthe data types of these values. For example, instead of writing `.release` we would have to\nwrite `AtomicOrder.release`. We would have to do this for every single value\nin this switch statement, and this is a lot of work. That is why type inference\nis commonly used on switch statements in Zig.\n\n\n\n## Type casting {#sec-type-cast}\n\nIn this section, I want to discuss type casting (or, type conversion) with you.\nWe use type casting when we have an object of type \"x\", and we want to convert\nit into an object of type \"y\", i.e. 
we want to change the data type of the object.\n\nMost languages have a formal way to perform type casting. In Rust for example, we normally\nuse the keyword `as`, and in C, we normally use the type casting syntax, e.g. `(int) x`.\nIn Zig, we use the `@as()` built-in function to cast an object of type \"x\", into\nan object of type \"y\".\n\nThis `@as()` function is the preferred way to perform type conversion (or type casting)\nin Zig. Because it is explicit, and, it also performs the casting only if it\nis unambiguous and safe. To use this function, you just provide the target data type\nas the first argument, and, the object that you want to cast as the second argument.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n const x: usize = 500;\n const y = @as(u32, x);\n try expect(@TypeOf(y) == u32);\n}\n```\n:::\n\n\n\n\nThis is the general way to perform type casting in Zig. But remember, `@as()` works only when casting\nis unambiguous and safe, and there are situations where these assumptions do not hold. For example,\nwhen casting an integer value into a float value, or vice-versa, it is not clear to the compiler\nhow to perform this conversion safely.\n\nTherefore, we need to use specialized \"casting functions\" in such situations.\nFor example, if you want to cast an integer value into a float value, then, you\nshould use the `@floatFromInt()` function. In the inverse scenario, you should use\nthe `@intFromFloat()` function.\n\nIn these functions, you just provide the object that you want to\ncast as input. 
Then, the target data type of the \"type casting operation\" is determined by\nthe type annotation of the object where you are saving the results.\nIn the example below, we are casting the object `x` into a value of type `f32`,\nbecause the object `y`, which is where we are saving the results, is annotated\nas an object of type `f32`.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n const x: usize = 565;\n const y: f32 = @floatFromInt(x);\n try expect(@TypeOf(y) == f32);\n}\n```\n:::\n\n\n\n\nAnother built-in function that is very useful when performing type casting operations is `@ptrCast()`.\nIn essence, we use the `@as()` built-in function when we want to explicitly convert (or cast) a Zig value/object\nfrom a type \"x\" to a type \"y\". However, pointers (we are going to discuss pointers\nin more depth in @sec-pointer) are a special type of object in Zig,\ni.e. they are treated differently from \"normal objects\".\n\nEvery time a pointer is involved in some \"type casting operation\" in Zig, the `@ptrCast()` function is used.\nThis function works similarly to `@floatFromInt()`.\nYou just provide the pointer object that you want to cast as input to this function, and the\ntarget data type is, once again, determined by the type annotation of the object where the results are being\nstored.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst expect = std.testing.expect;\ntest {\n const bytes align(@alignOf(u32)) = [_]u8{\n 0x12, 0x12, 0x12, 0x12\n };\n const u32_ptr: *const u32 = @ptrCast(&bytes);\n try expect(@TypeOf(u32_ptr) == *const u32);\n}\n```\n:::\n\n\n\n\n\n\n\n\n## Modules\n\nWe already talked about what modules are, and also, how to import other modules into\nyour current module via *import statements*. Every Zig module (i.e. a `.zig` file) that you write in your project\nis internally stored as a struct object.
Take the line exposed below as an example. In this line we are importing the\nZig Standard Library into our current module.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n```\n:::\n\n\n\n\nWhen we want to access the functions and objects from the standard library, we\nare basically accessing the data members of the struct stored in the `std`\nobject. That is why we use the same syntax that we use in normal structs, with the dot operator (`.`)\nto access the data members and methods of the struct.\n\nWhen this \"import statement\" gets executed, the result of this expression is a struct\nobject that contains the Zig Standard Library modules, global variables, functions, etc.\nAnd this struct object gets saved (or stored) inside the constant object named `std`.\n\n\nTake the [`thread_pool.zig` module from the project `zap`](https://github.com/kprotty/zap/blob/blog/src/thread_pool.zig)[^thread]\nas an example. This module is written as if it were\na big struct. That is why we have a top-level and public `init()` method\nwritten in this module. The idea is that all top-level functions written in this\nmodule are methods from the struct, and all top-level objects and struct declarations\nare data members of this struct.
The module is the struct itself.\n\n[^thread]: \n\n\nSo you would import and use this module by doing something like this:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst ThreadPool = @import(\"thread_pool.zig\");\nconst num_cpus = std.Thread.getCpuCount()\n catch @panic(\"failed to get cpu core count\");\nconst num_threads = std.math.cast(u16, num_cpus)\n catch std.math.maxInt(u16);\nconst pool = ThreadPool.init(\n .{ .max_threads = num_threads }\n);\n```\n:::\n", "supporting": [ "03-structs_files" ], diff --git a/_freeze/Chapters/10-stack-project/execute-results/html.json b/_freeze/Chapters/10-stack-project/execute-results/html.json index e3efae8..96a56f1 100644 --- a/_freeze/Chapters/10-stack-project/execute-results/html.json +++ b/_freeze/Chapters/10-stack-project/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "ba248634b19a681ee70adeaf7534e1bc", + "hash": "c396809996f7efa76c7acc37f14f8b73", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Project 3 - Building a stack data structure\n\nIn this chapter we are going to implement a stack data structure as our next small project\nin this book. Implementing basic data structures in any language is kind of a\n\"kindergarten task\" (if this term even exist) in computer science (CS), because\nwe normally learn and implement them in the first semesters of CS.\n\nBut this is actually good! 
Since this should be a very easy task, we don't need much to explain\nwhat a stack is, then, we can concentrate on what is really important here, which is learning\nhow the concept of \"generics\" is implemented in the Zig language, and how one of the key\nfeatures of Zig, which is comptime, works, and use the stack data structure to demonstrate\nthese concepts on the fly.\n\nBut before we get into building the stack data structure, we first need to understand\nwhat the `comptime` keyword does to your code, and after that, we also need to learn about\nhow generics work in Zig.\n\n\n## Understanding `comptime` in Zig {#sec-comptime}\n\nOne of the key features of Zig is `comptime`. This keyword introduces a whole\nnew concept and paradigm, that is tightly connected with the compilation process.\nAt @sec-compile-time we have described the importance and the role that \"compile-time versus runtime\"\nplays into Zig. At that section, we learned that the rules applied to a value/object change\na lot depending on whether this value is known at compile-time, or just at runtime.\n\nThe `comptime` keyword is strongly related to these two spaces in time (compile-time and runtime).\nLet's quickly recap the differences. Compile-time is the period of time when your\nZig source code is being compiled by the `zig` compiler, while the runtime is\nthe period of time when your Zig program is being executed, i.e. 
when we execute\nthe binary files that were generated by the `zig` compiler.\n\nThere are three ways in which you can apply the `comptime` keyword, which are:\n\n- apply `comptime` on a function argument.\n- apply `comptime` on an expression.\n- apply `comptime` on a block of expressions.\n\n\n\n### Applying over a function argument\n\nWhen you apply the `comptime` keyword on a function argument, you are saying to the `zig` compiler\nthat the value assigned to that particular function argument must be known at compile-time.\nWe explained in detail at @sec-compile-time what exactly \"value known at compile-time\" means, so,\nin case you have doubts about this idea, come back to that section.\n\nNow let's think about the consequences of this idea. First of all, we are imposing a limit, or, a requirement\nto that particular function argument. If the programmer accidentally tries to give a value to this\nfunction argument that is not known at compile time, the `zig` compiler will notice this problem, and\nas a consequence, it will raise a compilation error saying that it cannot compile your program, because\nyou are providing a value that is \"runtime known\" to a function argument that must be \"compile-time known\".\n\nTake a look at this very simple example below, where we define a `double()` function, that simply\ndoubles the input value named `num`. Notice that we use the `comptime` keyword before the name\nof the function argument. This keyword is marking the function argument `num` as a \"comptime argument\".\n\nThat is a function argument whose value must be compile-time known. This is why the expression\n`double(5678)` is valid, and no compilation errors are raised.
Because the value `5678`\nis compile-time known, so this is the expected behaviour for this function.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn double(comptime num: u32) u32 {\n return num * 2;\n}\ntest \"test comptime\" {\n _ = double(5678);\n}\n```\n:::\n\n\n\n\nBut what if we provide a number that is not compile-time known to this function?\nFor example, we might provide a different input value to this function depending\non the target OS of our compilation process. The code example below demonstrates such case.\n\nBecause the value of the object `n` is determined at runtime, we cannot provide this object\nas input to the `double()` function. The `zig` compiler will not allow it, because we marked\nthe `num` argument as a \"comptime argument\". That is why the `zig` compiler raises\nthe compile-time error exposed below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst builtin = @import(\"builtin\");\nfn double(comptime num: u32) u32 {\n return num * 2;\n}\ntest \"test comptime\" {\n var n: u32 = undefined;\n if (builtin.target.os.tag == .windows) {\n n = 1234;\n } else {\n n = 5678;\n }\n _ = double(n);\n}\n```\n:::\n\n\n\n\n```\nt.zig:12:16: error: runtime-known argument passed to comptime parameter \n```\n\nComptime arguments are frequently used on functions that return some sort\nof generic structure. In fact, `comptime` is the essence (or the basis) to make generics in Zig.\nWe are going to talk more about generics at @sec-generics.\n\nFor now, let's take a look at this code example from @karlseguin_generics. You\ncan see that this `IntArray()` function have one argument named `length`.\nThis argument is marked as comptime, and receives a value of type `usize` as input. 
So the value given to this argument\nmust be compile-time known.\nWe can also see that this function returns an array type, whose elements are `i64` values, as output.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn IntArray(comptime length: usize) type {\n return [length]i64;\n}\n```\n:::\n\n\n\n\nNow, the key component of this function is the `length` argument. This argument\nis used to determine the size of the array that is produced by the function. Let's\nthink about the consequences of that. If the size of the array is dependent on\nthe value assigned to the `length` argument, this means that the data type of the\noutput of the function depends on the value of this `length` argument.\n\nLet this statement sink in for a bit. As I described at @sec-root-file,\nZig is a strongly-typed language, especially on function declarations.\nSo every time we write a function in Zig, we have to annotate the data type of\nthe value returned by the function. But how can we do that, if this data type\ndepends on the value given to the argument of the function?\n\nThink about this for a second. If `length` is equal to 3 for example, then, the\nreturn type of the function is `[3]i64`. But if `length` is equal to 40, then,\nthe return type becomes `[40]i64`. At this point the `zig` compiler would be confused,\nand raise a compilation error, saying something like this:\n\n> Hey! You have annotated that this function should return a `[3]i64` value, but I got a `[40]i64` value instead! This doesn't look right!\n\nSo how can you solve this problem? How do we overcome this barrier? This is where\nthe `type` keyword comes in. This `type` keyword is basically saying to the\n`zig` compiler that this function will return some data type as output, but it doesn't know yet\nexactly what data type that is.
We will talk more about this at @sec-generics.\n\n\n\n### Applying over an expression\n\nWhen you apply the `comptime` keyword over an expression, then, it is guaranteed that the `zig` compiler will\nexecute this expression at compile-time. If for some reason, this expression cannot be executed at compile-time\n(for example, maybe this expression depends on a value that is only known at runtime), then, the `zig` compiler\nwill raise a compilation error.\n\nTake this example from the official documentation of Zig [@zigdocs]. We\nare executing the same `fibonacci()` function both at runtime, and, at compile-time.\nThe function is by default executed at runtime, but because we use the `comptime`\nkeyword at the second \"try expression\", this expression is executed at compile-time.\n\nThis might be a bit confusing for some people. Yes! When I say that this expression\nis executed at compile-time, I mean that this expression is compiled and executed\nwhile the `zig` compiler is compiling your Zig source code.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst expect = @import(\"std\").testing.expect;\nfn fibonacci(index: u32) u32 {\n if (index < 2) return index;\n return fibonacci(index - 1) + fibonacci(index - 2);\n}\n\ntest \"fibonacci\" {\n // test fibonacci at run-time\n try expect(fibonacci(7) == 13);\n // test fibonacci at compile-time\n try comptime expect(fibonacci(7) == 13);\n}\n```\n:::\n\n\n\n\nA lot of your Zig source code might be potentially executed at compile-time,\nbecause the `zig` compiler can figure out the output of some expressions,\nespecially if these expressions depend only on compile-time known values.\nWe have talked about this at @sec-compile-time.\n\nBut when you use the `comptime` keyword on an expression, there is no \"it might be executed\nat compile-time\" anymore. With the `comptime` keyword you are ordering the `zig` compiler\nto execute this expression at compile-time.
You are imposing this rule: it is guaranteed\nthat the compiler will always execute it at compile-time. Or, at least, the compiler\nwill try to execute it. If the compiler cannot execute the expression for whatever reason,\nthe compiler will raise a compilation error.\n\n\n### Applying over a block\n\nBlocks were described at @sec-blocks. When you apply the `comptime` keyword over a\nblock of expressions, you get essentially the same effect as when you apply this keyword to\na single expression. That is, the entire block of expressions is executed at\ncompile-time by the `zig` compiler.\n\nIn the example below, we mark the block labeled `blk` as a comptime block,\nand, therefore, the expressions inside this block are executed at compile-time.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst expect = @import(\"std\").testing.expect;\nfn fibonacci(index: u32) u32 {\n if (index < 2) return index;\n return fibonacci(index - 1) + fibonacci(index - 2);\n}\n\ntest \"fibonacci in a block\" {\n const x = comptime blk: {\n const n1 = 5;\n const n2 = 2;\n const n3 = n1 + n2;\n try expect(fibonacci(n3) == 13);\n break :blk n3;\n };\n _ = x;\n}\n```\n:::\n\n\n\n\n\n\n\n\n## Introducing Generics {#sec-generics}\n\nFirst of all, what is a generic? Generics are the idea of allowing a type\n(`f64`, `u8`, `u32`, `bool`, and also, user-defined types, like the `User` struct\nthat we defined at @sec-structs-and-oop) to be a parameter to methods, classes and\ninterfaces [@geeks_generics]. In other words, a \"generic\" is a class (or a method) that can work\nwith multiple data types.\n\nFor example, in Java, generics are created through the operator `<>`. With this operator,\na Java class is capable of receiving a data type as input, and therefore, the class can fit\nits features according to this input data type.\nAs another example, generics in C++ are supported through the concept of templates.\nClass templates in C++ are generics.\n\nIn Zig, generics are implemented through `comptime`.
The `comptime` keyword\nallows us to collect a data type at compile time, and pass this data type as\ninput to a piece of code.\n\n\n### A generic function {#sec-generic-fun}\n\nTake the `max()` function exposed below as a first example.\nThis function is essentially a \"generic function\".\nIn this function, we have a comptime function argument named `T`.\nNotice that this `T` argument has a data type of `type`. Weird, right? This `type` keyword is the\n\"father of all types\", or, \"the type of types\" in Zig.\n\nBecause we have used this `type` keyword in the `T` argument, we are telling\nthe `zig` compiler that this `T` argument will receive some data type as input.\nAlso notice the use of the `comptime` keyword in this argument.\nAs I described at @sec-comptime, every time you use this keyword in a function argument,\nthis means that the value of this argument must be known at compile-time.\nThis makes sense, right? Because there is no data type that is not known at compile-time.\n\nThink about this. Every data type that you will ever write is always\nknown at compile-time.
This is especially true because data types are essential\ninformation for the compiler to actually compile your source code.\nWith this in mind, it makes sense to mark this argument as a comptime argument.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn max(comptime T: type, a: T, b: T) T {\n return if (a > b) a else b;\n}\n```\n:::\n\n\n\n\nAlso notice that the value of the `T` argument is actually used\nto define the data type of the other arguments in the function, `a` and `b`, and also at the\nreturn type annotation of the function.\nThat is, the data type of these arguments (`a` and `b`), and, the return data type of the function itself,\nare determined by the input value given to the `T` argument.\n\nAs a result, we have a generic function that works with different data types.\nFor example, I can provide `u8` values to this `max()` function, and it will work as expected.\nBut if I provide `f64` values instead, it will also work as expected.\nWithout a generic function, I would have to write a different `max()` function\nfor each one of the data types that I wanted to use.\nThis generic function provides a very useful shortcut for us.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn max(comptime T: type, a: T, b: T) T {\n return if (a > b) a else b;\n}\ntest \"test max\" {\n const n1 = max(u8, 4, 10);\n std.debug.print(\"Max n1: {d}\\n\", .{n1});\n const n2 = max(f64, 89.24, 64.001);\n std.debug.print(\"Max n2: {d}\\n\", .{n2});\n}\n```\n:::\n\n\n\n\n```\nMax n1: 10\nMax n2: 89.24\n```\n\n\n\n### A generic data structure {#sec-generic-struct}\n\nEvery data structure that you find in the Zig Standard Library (e.g.
`ArrayList`, `HashMap`, etc.)\nis essentially a generic data structure.\nThese data structures are generic in the sense that they work with any data type you want.\nYou just say what the data type of the values that are going to be stored in this data\nstructure is, and they just work as expected.\n\nA generic data structure in Zig is how you replicate a generic class from Java,\nor, a class template from C++. But you may ask yourself: how do we build a\ngeneric data structure in Zig?\n\nThe basic idea is to write a generic function that creates the data structure definition\nfor the specific type we want. In other words, this generic function behaves as a \"factory of data structures\".\nThe generic function outputs the `struct` definition that defines this data structure for a\nspecific data type.\n\nTo create such a function, we need to add a comptime argument to this function that receives a data type\nas input. We already learned how to do this in the previous section (@sec-generic-fun).\nI think the best way to demonstrate how to create a generic data structure is to actually write one.\nThis is where we go into our next small project in this book. This one is a very small project,\nwhich is to write a generic stack data structure.\n\n\n\n\n## What is a stack? {#sec-what-stack}\n\nA stack data structure is a structure that follows a LIFO (*last in, first out*) principle.\nOnly two operations are normally supported in a stack data structure, which are `push` and `pop`.\nThe `push` operation is used to add new values to the stack, while `pop` is used to remove\nvalues from the stack.\n\nWhen people try to explain how the stack data structure works, the most common analogy\nthat they use is a stack of plates. Imagine that you have a stack of plates,\nfor example, a stack of 10 plates on your table. Each plate represents a value that\nis currently stored in this stack.\n\nWe begin with a stack of 10 different values, or 10 different plates.
Now, imagine that you want to\nadd a new plate (or a new value) to this stack, which translates to the `push` operation.\nYou would add this plate (or this value) by just putting the new plate\non the top of the stack. Then, you would increase the stack to 11 plates.\n\nBut how would you remove plates (or remove values) from this stack (a.k.a. the `pop` operation) ?\nTo do that, we would have to remove the plate on the top of the stack, and, as a result, we would\nhave, once again, 10 plates in the stack.\n\nThis demonstrates the LIFO concept, because the first plate in the stack, which is the plate\nin the bottom of the stack, is always the last plate to get out of the stack. Think about it. In order\nto remove this specific plate from the stack, we have to remove all plates in the\nstack. So every operation in the stack, either insertion or deletion, is always made at the top of the stack.\nThe @fig-stack below exposes this logic visually:\n\n![A diagram of a stack structure. Source: Wikipedia, the free encyclopedia.](./../Figures/lifo-stack.svg){#fig-stack}\n\n\n\n## Writing the stack data structure\n\nWe are going to write the stack data structure in two steps. First, we are going\nto implement a stack that can only store `u32` values. Then, after that, we are going\nto extend our implementation to make it generic, so that it works with any data type\nwe want.\n\nFirst, we need to decide how the values will be stored inside the stack. There are multiple\nways to implement the storage behind a stack structure. Some people prefer to use a doubly linked list,\nsome others prefer to use a dynamic array, etc. 
In this example we are going to use an array under the hood,\nto store the values in the stack, which is the `items` data member of our `Stack` struct definition.\n\nAlso notice in our `Stack` struct that we have three other data members: `capacity`, `length` and `allocator`.\nThe `capacity` member contains the capacity of the underlying array that stores the values in the stack.\nThe `length` contains the number of values that are currently being stored in the stack.\nAnd the `allocator` contains the allocator object that will be used by the stack structure whenever it\nneeds to allocate more space for the values that are being stored.\n\nWe begin by defining an `init()` method of this struct, which is going to be\nresponsible for instantiating a `Stack` object. Notice that, inside this\n`init()` method, we start by allocating an array with the capacity specified\nin the `capacity` argument.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst Allocator = std.mem.Allocator;\nconst Stack = struct {\n items: []u32,\n capacity: usize,\n length: usize,\n allocator: Allocator,\n\n pub fn init(allocator: Allocator, capacity: usize) !Stack {\n const buf = try allocator.alloc(u32, capacity);\n return .{\n .items = buf[0..],\n .capacity = capacity,\n .length = 0,\n .allocator = allocator,\n };\n }\n};\n```\n:::\n\n\n\n\n\n### Implementing the `push` operation\n\nNow that we have written the basic logic to create a new `Stack` object,\nwe can start writing the logic responsible for performing a push operation.\nRemember, a push operation in a stack data structure is the operation\nresponsible for adding a new value to the stack.\n\nSo how can we add a new value to the `Stack` object that we have?\nThe `push()` function exposed below is a possible answer to this question.\nRemember from what we discussed at @sec-what-stack that values are always added to the top of the stack.\nThis means that this `push()` function must always find the element
in the underlying array\nthat currently represents the top position of the stack, and then, add the input value there.\n\nFirst, we have an if statement in this function. This if statement is\nchecking whether we need to expand the underlying array to store\nthis new value that we are adding to the stack. In other words, maybe\nthe underlying array does not have enough capacity to store this new\nvalue, and, in this case, we need to expand our array to get the capacity that we need.\n\nSo, if the logical test in this if statement returns true, it means that the array\ndoes not have enough capacity, and we need to expand it before we store this new value.\nSo inside this if statement we are executing the necessary expressions to expand the underlying array.\nNotice that we use the allocator object to allocate a new array that is twice as big\nas the current array (`self.capacity * 2`).\n\nAfter that, we use a different built-in function named `@memcpy()`. This built-in function\nis equivalent to the `memcpy()` function from the C Standard Library[^cmemcpy]. It is used to\ncopy the values from one block of memory to another block of memory. In other words,\nyou can use this function to copy the values from one array into another array.\n\n[^cmemcpy]: \n\nWe are using this `@memcpy()` built-in function to copy the values that are currently stored\nin the underlying array of the stack object (`self.items`) into our new and bigger array that\nwe have allocated (`new_buf`). After we execute this function, the `new_buf` contains a copy\nof the values that are present at `self.items`.\n\nNow that we have secured a copy of our current values in the `new_buf` object, we\ncan free the memory currently allocated at `self.items`. After that, we just need\nto assign our new and bigger array to `self.items`.
This is the sequence\nof steps necessary to expand our array.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn push(self: *Stack, val: u32) !void {\n if ((self.length + 1) > self.capacity) {\n const new_buf = try self.allocator.alloc(\n u32, self.capacity * 2\n );\n @memcpy(\n new_buf[0..self.capacity], self.items\n );\n self.allocator.free(self.items);\n self.items = new_buf;\n self.capacity = self.capacity * 2;\n }\n\n self.items[self.length] = val;\n self.length += 1;\n}\n```\n:::\n\n\n\n\nAfter we make sure that we have enough room to store this new value\nthat we are adding to the stack, all we have to do is to assign\nthis value to the top element in this stack, and, increase the\nvalue of the `length` attribute by one. We find the top element\nin the stack by using the `length` attribute.\n\n\n\n### Implementing the `pop` operation\n\nNow we can implement the pop operation of our stack object.\nThis is a much easier operation to implement, and the `pop()` method below summarises\nall the logic that is needed.\n\nWe just have to find the element in the underlying array that currently represents the top\nof the stack, and set this element to \"undefined\", to indicate that\nthis element is \"empty\". After that, we also need to decrease\nthe `length` attribute of the stack by one.\n\nIf the current length of the stack is zero, it means that there are\nno values being stored in the stack currently.
So, in this case,\nwe could just return from the function and do nothing really.\nThis is what the if statement inside this function is checking for.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn pop(self: *Stack) void {\n if (self.length == 0) return;\n\n self.items[self.length - 1] = undefined;\n self.length -= 1;\n}\n```\n:::\n\n\n\n\n\n\n### Implementing the `deinit` method\n\nWe have implemented the methods responsible for the two main operations\nassociated with the stack data structure, which are `pop()` and `push()`,\nand we also have implemented the method responsible for instantiating\na new `Stack` object, which is the `init()` method.\n\nBut now, we also need to implement the method responsible for destroying\na `Stack` object. In Zig, this task is commonly associated with the method\nnamed `deinit()`. Most struct objects in Zig have such a method, and it\nis commonly nicknamed \"the destructor method\".\n\nIn theory, all we have to do to destroy the `Stack` object is to make\nsure that we free the allocated memory for the underlying array, using\nthe allocator object that is stored inside the `Stack` object.\nThis is what the `deinit()` method below is doing.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn deinit(self: *Stack) void {\n self.allocator.free(self.items);\n}\n```\n:::\n\n\n\n\n\n\n\n## Making it generic\n\nNow that we have implemented the basic skeleton of our stack data structure,\nwe can now focus on discussing how we can make it generic. How can we make\nthis basic skeleton work not only with `u32` values, but also, with any other\ndata type we want?\nFor example, we might need to create a stack object to store `User` values\nin it. How can we make this possible?
The answer lies in the use of generics\nand `comptime`.\n\nAs I described at @sec-generic-struct, the basic idea is to write a generic\nfunction that returns a struct definition as output.\nIn theory, we do not need much to transform our `Stack` struct into a generic\ndata structure. All that we need to do is to transform the underlying array\nof the stack into a generic array.\n\nIn other words, this underlying array needs to be a \"chameleon\". It needs to adapt,\nand transform itself into an array of any data type that we want. For example, if we need to create\na stack that will store `u8` values, then, this underlying array needs to be\na `u8` array (i.e. `[]u8`). But if we need to store `User` values instead, then,\nthis array needs to be a `User` array (i.e. `[]User`), and so on.\n\nWe do that by using a generic function, because a generic function can receive a data type\nas input, and we can pass this data type to the struct definition of our `Stack` object.\nTherefore, we can use the generic function to create a `Stack` object that can store\nthe data type we want. If we want to create a stack structure that stores `User` values,\nwe pass the `User` data type to this generic function, and it will create for us\nthe struct definition that describes a `Stack` object that can store `User` values in it.\n\nLook at the code example below. I have omitted some parts of the `Stack` struct definition\nfor brevity.
However, if a specific part of our `Stack` struct is not exposed here\nin this example, then it is because this part did not change from the previous example.\nIt remains the same.\n\n\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn Stack(comptime T: type) type {\n return struct {\n items: []T,\n capacity: usize,\n length: usize,\n allocator: Allocator,\n const Self = @This();\n\n pub fn init(allocator: Allocator,\n capacity: usize) !Stack(T) {\n const buf = try allocator.alloc(T, capacity);\n return .{\n .items = buf[0..],\n .capacity = capacity,\n .length = 0,\n .allocator = allocator,\n };\n }\n\n pub fn push(self: *Self, val: T) !void {\n // The rest of the struct is truncated for brevity\n }\n };\n}\n```\n:::\n\n\n\n\nNotice that we have created a function in this example named `Stack()`. This function\ntakes a type as input, and passes this type to the struct definition of our\n`Stack` object. The data member `items` is now an array of type `T`, which is the\ndata type that we have provided as input to the function. The function argument\n`val` in the `push()` function is now a value of type `T` too.\n\nWe can just provide a data type to this function, and it will create a definition of a\n`Stack` object that can store values of the data type that we have provided. In the example below, we are creating\nthe definition of a\n`Stack` object that can store `u8` values in it.
This definition is stored in the `Stacku8` object.\nThis `Stacku8` object becomes our new struct; it is the struct that we are going to use\nto create our `Stack` object.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst Stacku8 = Stack(u8);\nvar stack = try Stacku8.init(allocator, 10);\ndefer stack.deinit();\ntry stack.push(1);\ntry stack.push(2);\ntry stack.push(3);\ntry stack.push(4);\ntry stack.push(5);\ntry stack.push(6);\n\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstd.debug.print(\"Stack capacity: {d}\\n\", .{stack.capacity});\n\nstack.pop();\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstack.pop();\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstd.debug.print(\n \"Stack state: {any}\\n\",\n .{stack.items[0..stack.length]}\n);\n```\n:::\n\n\n\n\n```\nStack len: 6\nStack capacity: 10\nStack len: 5\nStack len: 4\nStack state: { 1, 2, 3, 4, 0, 0, 0, 0, 0, 0 }\n```\n\nEvery generic data structure in the Zig Standard Library (`ArrayList`, `HashMap`, `SinglyLinkedList`, etc.)\nis implemented through this logic. They use a generic function to create the struct definition that can work\nwith the data type that you provided as input.\n\n\n\n\n## Conclusion\n\nThe full source code of the stack structure discussed in this chapter is freely available at the official\nrepository of this book.
Just checkout the [`stack.zig`](https://github.com/pedropark99/zig-book/tree/main/ZigExamples/data-structures/stack.zig)[^zig-stack]\nfor the `u32` version of our stack,\nand the [`generic_stack.zig`](https://github.com/pedropark99/zig-book/tree/main/ZigExamples/data-structures/generic_stack.zig)[^zig-stack2]\nfor the generic version, available inside the `ZigExamples` folder of the repository.\n\n\n[^zig-stack]: \n[^zig-stack2]: \n\n", + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n\n\n# Project 3 - Building a stack data structure\n\nIn this chapter we are going to implement a stack data structure as our next small project\nin this book. Implementing basic data structures in any language is kind of a\n\"kindergarten task\" (if this term even exist) in computer science (CS), because\nwe normally learn and implement them in the first semesters of CS.\n\nBut this is actually good! Since this should be a very easy task, we don't need much to explain\nwhat a stack is, then, we can concentrate on what is really important here, which is learning\nhow the concept of \"generics\" is implemented in the Zig language, and how one of the key\nfeatures of Zig, which is comptime, works, and use the stack data structure to demonstrate\nthese concepts on the fly.\n\nBut before we get into building the stack data structure, we first need to understand\nwhat the `comptime` keyword does to your code, and after that, we also need to learn about\nhow generics work in Zig.\n\n\n## Understanding `comptime` in Zig {#sec-comptime}\n\nOne of the key features of Zig is `comptime`. This keyword introduces a whole\nnew concept and paradigm, that is tightly connected with the compilation process.\nAt @sec-compile-time we have described the importance and the role that \"compile-time versus runtime\"\nplays into Zig. 
At that section, we learned that the rules applied to a value/object change\na lot depending on whether this value is known at compile-time, or just at runtime.\n\nThe `comptime` keyword is strongly related to these two spaces in time (compile-time and runtime).\nLet's quickly recap the differences. Compile-time is the period of time when your\nZig source code is being compiled by the `zig` compiler, while the runtime is\nthe period of time when your Zig program is being executed, i.e. when we execute\nthe binary files that were generated by the `zig` compiler.\n\nThere are three ways in which you can apply the `comptime` keyword, which are:\n\n- apply `comptime` on a function argument.\n- apply `comptime` on an object.\n- apply `comptime` on a block of expressions.\n\n\n\n### Applying over a function argument\n\nWhen you apply the `comptime` keyword on a function argument, you are saying to the `zig` compiler\nthat the value assigned to that particular function argument must be known at compile-time.\nWe explained in details at @sec-compile-time what exactly \"value known at compile-time\" means, so,\nin case you have doubts about this idea, comeback to that section.\n\nNow let's think about the consequences of this idea. First of all, we are imposing a limit, or, a requirement\nto that particular function argument. If the programmer accidentally tries to give a value to this\nfunction argument that is not known at compile time, the `zig` compiler will notice this problem, and\nas a consequence, it will raise a compilation error saying that it cannot compile your program. Because\nyou are providing a value that is \"runtime known\" to a function argument that must be \"compile-time known\".\n\nTake a look at this very simple example below, where we define a `twice()` function, that simply\ndoubles the input value named `num`. Notice that we use the `comptime` keyword before the name\nof the function argument. 
This keyword is marking the function argument `num` as a \"comptime argument\".\n\nThat is a function argument whose value must be compile-time known. This is why the expression\n`twice(5678)` is valid, and no compilation errors are raised. Because the value `5678`\nis compile-time known, so this is the expected behaviour for this function.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn twice(comptime num: u32) u32 {\n return num * 2;\n}\ntest \"test comptime\" {\n _ = twice(5678);\n}\n```\n:::\n\n\n\n\nBut what if we provide a number that is not compile-time known to this function?\nFor example, we might provide a different input value to this function depending\non the target OS of our compilation process. The code example below demonstrates such case.\n\nBecause the value of the object `n` is determined at runtime, we cannot provide this object\nas input to the `twice()` function. The `zig` compiler will not allow it, because we marked\nthe `num` argument as a \"comptime argument\". That is why the `zig` compiler raises\nthe compile-time error exposed below:\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst builtin = @import(\"builtin\");\nfn twice(comptime num: u32) u32 {\n return num * 2;\n}\ntest \"test comptime\" {\n var n: u32 = undefined;\n if (builtin.target.os.tag == .windows) {\n n = 1234;\n } else {\n n = 5678;\n }\n _ = twice(n);\n}\n```\n:::\n\n\n\n\n```\nt.zig:12:16: error: runtime-known argument passed to comptime parameter \n```\n\nComptime arguments are frequently used on functions that return some sort\nof generic structure. In fact, `comptime` is the essence (or the basis) to make generics in Zig.\nWe are going to talk more about generics at @sec-generics.\n\nFor now, let's take a look at this code example from @karlseguin_generics. You\ncan see that this `IntArray()` function have one argument named `length`.\nThis argument is marked as comptime, and receives a value of type `usize` as input. 
So the value given to this argument\nmust be compile-time known.\nWe can also see that this function returns an array of `i64` values as output.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn IntArray(comptime length: usize) type {\n return [length]i64;\n}\n```\n:::\n\n\n\n\nNow, the key component of this function is the `length` argument. This argument\nis used to determine the size of the array that is produced by the function. Let's\nthink about the consequences of that. If the size of the array is dependent on\nthe value assigned to the `length` argument, this means that the data type of the\noutput of the function depends on the value of this `length` argument.\n\nLet this statement sink for a bit in your mind. As I described at @sec-root-file,\nZig is a strongly-typed language, especially on function declarations.\nSo every time we write a function in Zig, we have to annotate the data type of\nthe value returned by the function. But how can we do that, if this data type\ndepends on the value given to the argument of the function?\n\nThink about this for a second. If `length` is equal to 3 for example, then, the\nreturn type of the function is `[3]i64`. But if `length` is equal to 40, then,\nthe return type becomes `[40]i64`. At this point the `zig` compiler would be confused,\nand raise a compilation error, saying something like this:\n\n> Hey! You have annotated that this function should return a `[3]i64` value, but I got a `[40]i64` value instead! This doesn't look right!\n\nSo how can you solve this problem? How do we overcome this barrier? This is when\nthe `type` keyword comes in. This `type` keyword is basically saying to the\n`zig` compiler that this function will return some data type as output, but it doesn't know yet\nwhat exactly data type that is. 
We will talk more about this at @sec-generics.\n\n\n\n### Applying over an expression\n\nWhen you apply the `comptime` keyword over an expression, then, it is guaranteed that the `zig` compiler will\nexecute this expression at compile-time. If for some reason, this expression cannot be executed at compile-time\n(e.g. for example, maybe this expression depends on a value that is only known at runtime), then, the `zig` compiler\nwill raise a compilation error.\n\nTake this example from the official documentation of Zig [@zigdocs]. We\nare executing the same `fibonacci()` function both at runtime, and, at compile-time.\nThe function is by default executed at runtime, but because we use the `comptime`\nkeyword at the second \"try expression\", this expression is executed at compile-time.\n\nThis might be a bit confusing for some people. Yes! When I say that this expression\nis executed at compile-time, I mean that this expression is compiled and executed\nwhile the `zig` compiler is compiling your Zig source code.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst expect = @import(\"std\").testing.expect;\nfn fibonacci(index: u32) u32 {\n if (index < 2) return index;\n return fibonacci(index - 1) + fibonacci(index - 2);\n}\n\ntest \"fibonacci\" {\n // test fibonacci at run-time\n try expect(fibonacci(7) == 13);\n // test fibonacci at compile-time\n try comptime expect(fibonacci(7) == 13);\n}\n```\n:::\n\n\n\n\nA lot of your Zig source code might be potentially executed at compile-time,\nbecause the `zig` compiler can figure it out the output of some expressions.\nEspecially if these expressions depends only at compile-time known values.\nWe have talked about this at @sec-compile-time.\n\nBut when you use the `comptime` keyword on an expression, there is no \"it might be executed\nat compile-time\" anymore. With the `comptime` keyword you are ordering the `zig` compiler\nto execute this expression at compile-time. 
You are imposing this rule: it is guaranteed\nthat the compiler will always execute it at compile-time. Or, at least, the compiler\nwill try to execute it. If the compiler cannot execute the expression for whatever reason,\nthe compiler will raise a compilation error.\n\n\n### Applying over a block\n\nBlocks were described at @sec-blocks. When you apply the `comptime` keyword over a\nblock of expressions, you get essentially the same effect as when you apply this keyword to\na single expression. That is, the entire block of expressions is executed at\ncompile-time by the `zig` compiler.\n\nIn the example below, we mark the block labeled `blk` as a comptime block,\nand, therefore, the expressions inside this block are executed at compile-time.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst expect = @import(\"std\").testing.expect;\nfn fibonacci(index: u32) u32 {\n if (index < 2) return index;\n return fibonacci(index - 1) + fibonacci(index - 2);\n}\n\ntest \"fibonacci in a block\" {\n const x = comptime blk: {\n const n1 = 5;\n const n2 = 2;\n const n3 = n1 + n2;\n try expect(fibonacci(n3) == 13);\n break :blk n3;\n };\n _ = x;\n}\n```\n:::\n\n\n\n\n\n\n\n\n## Introducing Generics {#sec-generics}\n\nFirst of all, what is a generic? Generics are the idea of allowing a type\n(`f64`, `u8`, `u32`, `bool`, and also, user-defined types, like the `User` struct\nthat we defined at @sec-structs-and-oop) to be a parameter to methods, classes and\ninterfaces [@geeks_generics]. In other words, a \"generic\" is a class (or a method) that can work\nwith multiple data types.\n\nFor example, in Java, generics are created through the operator `<>`. With this operator,\na Java class is capable of receiving a data type as input, and therefore, the class can fit\nits features according to this input data type.\nAs another example, generics in C++ are supported through the concept of templates.\nClass templates in C++ are generics.\n\nIn Zig, generics are implemented through `comptime`. 
The `comptime` keyword\nallows us to collect a data type at compile time, and pass this data type as\ninput to a piece of code.\n\n\n### A generic function {#sec-generic-fun}\n\nTake the `max()` function exposed below as a first example.\nThis function is essentially a \"generic function\".\nIn this function, we have a comptime function argument named `T`.\nNotice that this `T` argument have a data type of `type`. Weird right? This `type` keyword is the\n\"father of all types\", or, \"the type of types\" in Zig.\n\nBecause we have used this `type` keyword in the `T` argument, we are telling\nthe `zig` compiler that this `T` argument will receive some data type as input.\nAlso notice the use of the `comptime` keyword in this argument.\nAs I described at @sec-comptime, every time you use this keyword in a function argument,\nthis means that the value of this argument must be known at compile-time.\nThis makes sense, right? Because there is no data type that is not known at compile-time.\n\nThink about this. Every data type that you will ever write is always\nknown at compile-time. 
Especially because data types are an essential\ninformation for the compiler to actually compile your source code.\nHaving this in mind, makes sense to mark this argument as a comptime argument.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn max(comptime T: type, a: T, b: T) T {\n return if (a > b) a else b;\n}\n```\n:::\n\n\n\n\nAlso notice that the value of the `T` argument is actually used\nto define the data type of the other arguments in the function, `a` and `b`, and also at the\nreturn type annotation of the function.\nThat is, the data type of these arguments (`a` and `b`), and, the return data type of the function itself,\nare determined by the input value given to the `T` argument.\n\nAs a result, we have a generic function that works with different data types.\nFor example, I can provide `u8` values to this `max()` function, and it will work as expected.\nBut if I provide `f64` values instead, it will also work as expected.\nWithout a generic function, I would have to write a different `max()` function\nfor each one of the data types that I wanted to use.\nThis generic function provides a very useful shortcut for us.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nfn max(comptime T: type, a: T, b: T) T {\n return if (a > b) a else b;\n}\ntest \"test max\" {\n const n1 = max(u8, 4, 10);\n std.debug.print(\"Max n1: {d}\\n\", .{n1});\n const n2 = max(f64, 89.24, 64.001);\n std.debug.print(\"Max n2: {d}\\n\", .{n2});\n}\n```\n:::\n\n\n\n\n```\nMax n1: 10\nMax n2: 89.24\n```\n\n\n\n### A generic data structure {#sec-generic-struct}\n\nEvery data structure that you find in the Zig Standard Library (e.g. 
`ArrayList`, `HashMap`, etc.)\nis essentially a generic data structure.\nThese data structures are generic in the sense that they work with any data type you want.\nYou just specify the data type of the values that are going to be stored in this data\nstructure, and it just works as expected.\n\nA generic data structure in Zig is how you replicate a generic class from Java,\nor, a class template from C++. But you may ask yourself: how do we build a\ngeneric data structure in Zig?\n\nThe basic idea is to write a generic function that creates the data structure definition\nfor the specific type we want. In other words, this generic function behaves as a \"factory of data structures\".\nThe generic function outputs the `struct` definition that defines this data structure for a\nspecific data type.\n\nTo create such a function, we need to add a comptime argument to this function that receives a data type\nas input. We already learned how to do this in the previous section (@sec-generic-fun).\nI think the best way to demonstrate how to create a generic data structure is to actually write one.\nThis is where we go into our next small project in this book. This one is a very small project,\nwhich is to write a generic stack data structure.\n\n\n\n\n## What is a stack? {#sec-what-stack}\n\nA stack data structure is a structure that follows a LIFO (*last in, first out*) principle.\nOnly two operations are normally supported in a stack data structure, which are `push` and `pop`.\nThe `push` operation is used to add new values to the stack, while `pop` is used to remove\nvalues from the stack.\n\nWhen people try to explain how the stack data structure works, the most common analogy\nthat they use is a stack of plates. Imagine that you have a stack of plates,\nfor example, a stack of 10 plates on your table. Each plate represents a value that\nis currently stored in this stack.\n\nWe begin with a stack of 10 different values, or 10 different plates. 
Now, imagine that you want to\nadd a new plate (or a new value) to this stack, which translates to the `push` operation.\nYou would add this plate (or this value) by just putting the new plate\non the top of the stack. Then, you would increase the stack to 11 plates.\n\nBut how would you remove plates (or remove values) from this stack (a.k.a. the `pop` operation) ?\nTo do that, we would have to remove the plate on the top of the stack, and, as a result, we would\nhave, once again, 10 plates in the stack.\n\nThis demonstrates the LIFO concept, because the first plate in the stack, which is the plate\nin the bottom of the stack, is always the last plate to get out of the stack. Think about it. In order\nto remove this specific plate from the stack, we have to remove all plates in the\nstack. So every operation in the stack, either insertion or deletion, is always made at the top of the stack.\nThe @fig-stack below exposes this logic visually:\n\n![A diagram of a stack structure. Source: Wikipedia, the free encyclopedia.](./../Figures/lifo-stack.svg){#fig-stack}\n\n\n\n## Writing the stack data structure\n\nWe are going to write the stack data structure in two steps. First, we are going\nto implement a stack that can only store `u32` values. Then, after that, we are going\nto extend our implementation to make it generic, so that it works with any data type\nwe want.\n\nFirst, we need to decide how the values will be stored inside the stack. There are multiple\nways to implement the storage behind a stack structure. Some people prefer to use a doubly linked list,\nsome others prefer to use a dynamic array, etc. 
In this example we are going to use an array behind the hood,\nto store the values in the stack, which is the `items` data member of our `Stack` struct definition.\n\nAlso notice in our `Stack` struct that we have three other data members: `capacity`, `length` and `allocator`.\nThe `capacity` member contains the capacity of the underlying array that stores the values in the stack.\nThe `length` contains the number of values that are currently being stored in the stack.\nAnd the `allocator` contains the allocator object that will be used by the stack structure whenever it\nneeds to allocate more space for the values that are being stored.\n\nWe begin by defining an `init()` method of this struct, which is going to be\nresponsible for instantiating a `Stack` object. Notice that, inside this\n`init()` method, we start by allocating an array with the capacity specified\nin the `capacity` argument.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst Allocator = std.mem.Allocator;\nconst Stack = struct {\n items: []u32,\n capacity: usize,\n length: usize,\n allocator: Allocator,\n\n pub fn init(allocator: Allocator, capacity: usize) !Stack {\n var buf = try allocator.alloc(u32, capacity);\n return .{\n .items = buf[0..],\n .capacity = capacity,\n .length = 0,\n .allocator = allocator,\n };\n }\n};\n```\n:::\n\n\n\n\n\n### Implementing the `push` operation\n\nNow that we have written the basic logic to create a new `Stack` object,\nwe can start writing the logic responsible for performing a push operation.\nRemember, a push operation in a stack data structure is the operation\nresponsible for adding a new value to the stack.\n\nSo how can we add a new value to the `Stack` object that we have?\nThe `push()` function exposed below is a possible answer to this question.\nRemember from what we discussed at @sec-what-stack that values are always added to the top of the stack.\nThis means that this `push()` function must always find the element 
in the underlying array\nthat currently represents the top position of the stack, and then, add the input value there.\n\nFirst, we have an if statement in this function. This if statement is\nchecking whether we need to expand the underlying array to store\nthis new value that we are adding to the stack. In other words, maybe\nthe underlying array does not have enough capacity to store this new\nvalue, and, in this case, we need to expand our array to get the capacity that we need.\n\nSo, if the logical test in this if statement returns true, it means that the array\ndoes not have enough capacity, and we need to expand it before we store this new value.\nSo inside this if statement we are executing the necessary expressions to expand the underlying array.\nNotice that we use the allocator object to allocate a new array that is twice as big\nas the current array (`self.capacity * 2`).\n\nAfter that, we use a built-in function named `@memcpy()`. This built-in function\nis equivalent to the `memcpy()` function from the C Standard Library[^cmemcpy]. It is used to\ncopy the values from one block of memory to another block of memory. In other words,\nyou can use this function to copy the values from one array into another array.\n\n[^cmemcpy]: \n\nWe are using this `@memcpy()` built-in function to copy the values that are currently stored\nin the underlying array of the stack object (`self.items`) into our new and bigger array that\nwe have allocated (`new_buf`). After we execute this function, the `new_buf` contains a copy\nof the values that are present at `self.items`.\n\nNow that we have secured a copy of our current values in the `new_buf` object, we\ncan free the memory currently allocated at `self.items`. After that, we just need\nto assign our new and bigger array to `self.items`. 
This is the sequence\nof steps necessary to expand our array.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn push(self: *Stack, val: u32) !void {\n if ((self.length + 1) > self.capacity) {\n var new_buf = try self.allocator.alloc(\n u32, self.capacity * 2\n );\n @memcpy(\n new_buf[0..self.capacity], self.items\n );\n self.allocator.free(self.items);\n self.items = new_buf;\n self.capacity = self.capacity * 2;\n }\n\n self.items[self.length] = val;\n self.length += 1;\n}\n```\n:::\n\n\n\n\nAfter we make sure that we have enough room to store this new value\nthat we are adding to the stack, all we have to do is to assign\nthis value to the top element in this stack, and, increase the\nvalue of the `length` attribute by one. We find the top element\nin the stack by using the `length` attribute.\n\n\n\n### Implementing the `pop` operation\n\nNow we can implement the pop operation of our stack object.\nThis is a much easier operation to implement, and the `pop()` method below summarises\nall the logic that is needed.\n\nWe just have to find the element in the underlying array that currently represents the top\nof the stack, and set this element to \"undefined\", to indicate that\nthis element is \"empty\". After that, we also need to decrease\nthe `length` attribute of the stack by one.\n\nIf the current length of the stack is zero, it means that there is\nno values being stored in the stack currently. 
So, in this case,\nwe can just return from the function and do nothing.\nThis is what the if statement inside this function is checking for.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn pop(self: *Stack) void {\n if (self.length == 0) return;\n\n self.items[self.length - 1] = undefined;\n self.length -= 1;\n}\n```\n:::\n\n\n\n\n\n\n### Implementing the `deinit` method\n\nWe have implemented the methods responsible for the two main operations\nassociated with the stack data structure, which are `pop()` and `push()`,\nand we have also implemented the method responsible for instantiating\na new `Stack` object, which is the `init()` method.\n\nBut now, we also need to implement the method responsible for destroying\na `Stack` object. In Zig, this task is commonly associated with the method\nnamed `deinit()`. Most struct objects in Zig have such a method, and it\nis commonly nicknamed \"the destructor method\".\n\nIn theory, all we have to do to destroy the `Stack` object is to make\nsure that we free the allocated memory for the underlying array, using\nthe allocator object that is stored inside the `Stack` object.\nThis is what the `deinit()` method below is doing.\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn deinit(self: *Stack) void {\n self.allocator.free(self.items);\n}\n```\n:::\n\n\n\n\n\n\n\n## Making it generic\n\nNow that we have implemented the basic skeleton of our stack data structure,\nwe can focus on discussing how we can make it generic. How can we make\nthis basic skeleton work not only with `u32` values, but also with any other\ndata type we want?\nFor example, we might need to create a stack object to store `User` values\nin it. How can we make this possible? 
The answer lies in the use of generics\nand `comptime`.\n\nAs I described at @sec-generic-struct, the basic idea is to write a generic\nfunction that returns a struct definition as output.\nIn theory, we do not need much to transform our `Stack` struct into a generic\ndata structure. All that we need to do is to transform the underlying array\nof the stack into a generic array.\n\nIn other words, this underlying array needs to be a \"chameleon\". It needs to adapt,\nand transform itself into an array of any data type that we want. For example, if we need to create\na stack that will store `u8` values, then, this underlying array needs to be\na `u8` array (i.e. `[]u8`). But if we need to store `User` values instead, then,\nthis array needs to be a `User` array (i.e. `[]User`), and so on.\n\nWe do that by using a generic function, because a generic function can receive a data type\nas input, and we can pass this data type to the struct definition of our `Stack` object.\nTherefore, we can use the generic function to create a `Stack` object that can store\nthe data type we want. If we want to create a stack structure that stores `User` values,\nwe pass the `User` data type to this generic function, and it will create for us\nthe struct definition that describes a `Stack` object that can store `User` values in it.\n\nLook at the code example below. I have omitted some parts of the `Stack` struct definition\nfor brevity. 
If a specific part of our `Stack` struct is not shown here\nin this example, it is because this part did not change from the previous example.\nIt remains the same.\n\n\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn Stack(comptime T: type) type {\n return struct {\n items: []T,\n capacity: usize,\n length: usize,\n allocator: Allocator,\n const Self = @This();\n\n pub fn init(allocator: Allocator,\n capacity: usize) !Stack(T) {\n var buf = try allocator.alloc(T, capacity);\n return .{\n .items = buf[0..],\n .capacity = capacity,\n .length = 0,\n .allocator = allocator,\n };\n }\n\n pub fn push(self: *Self, val: T) !void {\n // Truncate the rest of the struct\n }\n };\n}\n```\n:::\n\n\n\n\nNotice that we have created a function in this example named `Stack()`. This function\ntakes a type as input, and passes this type to the struct definition of our\n`Stack` object. The data member `items` is now an array of type `T`, which is the\ndata type that we have provided as input to the function. The function argument\n`val` in the `push()` function is now a value of type `T` too.\n\nWe can just provide a data type to this function, and it will create a definition of a\n`Stack` object that can store values of the data type that we have provided. In the example below, we are creating\nthe definition of a\n`Stack` object that can store `u8` values in it. 
This definition is stored in the `Stacku8` object.\nThis `Stacku8` object becomes our new struct: it is the struct that we are going to use\nto create our `Stack` object.\n\n\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst Stacku8 = Stack(u8);\nvar stack = try Stacku8.init(allocator, 10);\ndefer stack.deinit();\ntry stack.push(1);\ntry stack.push(2);\ntry stack.push(3);\ntry stack.push(4);\ntry stack.push(5);\ntry stack.push(6);\n\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstd.debug.print(\"Stack capacity: {d}\\n\", .{stack.capacity});\n\nstack.pop();\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstack.pop();\nstd.debug.print(\"Stack len: {d}\\n\", .{stack.length});\nstd.debug.print(\n \"Stack state: {any}\\n\",\n .{stack.items[0..stack.length]}\n);\n```\n:::\n\n\n\n\n```\nStack len: 6\nStack capacity: 10\nStack len: 5\nStack len: 4\nStack state: { 1, 2, 3, 4 }\n```\n\nEvery generic data structure in the Zig Standard Library (`ArrayList`, `HashMap`, `SinglyLinkedList`, etc.)\nis implemented through this logic. They use a generic function to create the struct definition that can work\nwith the data type that you provided as input.\n\n\n\n\n## Conclusion\n\nThe full source code of the stack structure discussed in this chapter is freely available at the official\nrepository of this book. 
Just checkout the [`stack.zig`](https://github.com/pedropark99/zig-book/tree/main/ZigExamples/data-structures/stack.zig)[^zig-stack]\nfor the `u32` version of our stack,\nand the [`generic_stack.zig`](https://github.com/pedropark99/zig-book/tree/main/ZigExamples/data-structures/generic_stack.zig)[^zig-stack2]\nfor the generic version, available inside the `ZigExamples` folder of the repository.\n\n\n[^zig-stack]: \n[^zig-stack2]: \n\n", "supporting": [ "10-stack-project_files" ], diff --git a/docs/Chapters/01-base64.html b/docs/Chapters/01-base64.html index 6760e07..fe7ecaf 100644 --- a/docs/Chapters/01-base64.html +++ b/docs/Chapters/01-base64.html @@ -694,7 +694,7 @@

Section 3.2. So, if you are not familiar with them, I highly recommend you to comeback to that section, and read it. By looking at the encode() function, you will see that we use this allocator object to allocate enough memory to store the output of encoding process.

+

I described everything you need to know about allocator objects in Section 3.3. So, if you are not familiar with them, I highly recommend that you go back to that section and read it. By looking at the encode() function, you will see that we use this allocator object to allocate enough memory to store the output of the encoding process.

The main for loop in the function is responsible for iterating through the entire input string. In every iteration, we use a count variable to count how many iterations we had at the moment. When count reaches 3, then, we try to encode the 3 characters (or bytes) that we have accumulated in the temporary buffer object (buf).

After encoding these 3 characters and storing the result in the output variable, we reset the count variable to zero, and start to count again on the next iteration of the loop. If the loop hits the end of the string, and the count variable is less than 3, then it means that the temporary buffer contains the last 1 or 2 bytes from the input. That is why we have two if statements after the for loop: to deal with each possible case.

diff --git a/docs/Chapters/01-memory.html b/docs/Chapters/01-memory.html index d31c436..568c1e4 100644 --- a/docs/Chapters/01-memory.html +++ b/docs/Chapters/01-memory.html @@ -307,17 +307,18 @@

Table of contents

  • 3.1.5 Heap
  • 3.1.6 Summary
  • -
  • 3.2 Allocators +
  • 3.2 Stack overflows
  • +
  • 3.3 Allocators
  • @@ -480,7 +481,7 @@

    Unlike stack memory, heap memory is allocated explicitly by programmers and it won’t be deallocated until it is explicitly freed (Chen and Guo 2022).

    -

    To store an object in the heap, you, the programmer, needs to explicitly tells Zig to do so, by using an allocator to allocate some space in the heap. At Section 3.2, I will present how you can use allocators to allocate memory in Zig.

    +

    To store an object in the heap, you, the programmer, need to explicitly tell Zig to do so, by using an allocator to allocate some space in the heap. In Section 3.3, I will present how you can use allocators to allocate memory in Zig.

    @@ -508,37 +509,51 @@

    -

    3.2 Allocators

    +
    +

    3.2 Stack overflows

    +

    Allocating memory on the stack is generally faster than allocating it on the heap. But this better performance comes with many restrictions. We have already discussed many of these restrictions of the stack in Section 3.1.4. But there is one more important limitation that I want to talk about, which is the size of the stack itself.

    +

    The stack is limited in size. This size varies from computer to computer, and it depends on a lot of things (the computer architecture, the operating system, etc.). Nevertheless, this size is usually not that big. This is why we normally use the stack to store only temporary and small objects in memory.

    +

    In essence, if you try to make an allocation on the stack that is so big that it exceeds the stack size limit, a stack overflow happens, and your program just crashes as a result. In other words, a stack overflow happens when you attempt to use more space than is available on the stack.

    +

    This type of problem is very similar to a buffer overflow, i.e. you are trying to use more space than is available in the “buffer object”. However, a stack overflow always causes your program to crash, while a buffer overflow does not always cause your program to crash (although it often does).

    +

    You can see an example of a stack overflow in the example below. We are trying to allocate a very big array of u64 values on the stack. You can see below that this program does not run successfully, because it crashes with a “segmentation fault” error message.

    +
    +
    var very_big_alloc: [1000 * 1000 * 24]u64 = undefined;
    +@memset(very_big_alloc[0..], 0);
    +
    +
    Segmentation fault (core dumped)
    +

    This segmentation fault error is a result of the stack overflow that was caused by the big memory allocation made on the stack, to store the very_big_alloc object. This is why very big objects are usually stored on the heap, instead of the stack.

    +
    +
    +
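As a rough sketch of the alternative, the same number of u64 values can be requested from the heap instead of the stack. I am assuming `std.heap.page_allocator` here only for convenience; any other allocator object would be used in the same way.

```zig
const std = @import("std");

pub fn main() !void {
    // Assumption: the page allocator is used only to keep the sketch short.
    const allocator = std.heap.page_allocator;

    // Same element count as `very_big_alloc`, but stored on the heap.
    const very_big_alloc = try allocator.alloc(u64, 1000 * 1000 * 24);
    defer allocator.free(very_big_alloc);

    @memset(very_big_alloc, 0);
    std.debug.print("allocated {d} elements\n", .{very_big_alloc.len});
}
```

If the system cannot provide that much memory, `alloc()` returns an error value that `try` propagates, instead of the program crashing with a segmentation fault.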

    3.3 Allocators

    One key aspect about Zig, is that there are “no hidden-memory allocations” in Zig. What that really means, is that “no allocations happen behind your back in the standard library” (Sobeston 2024).

    This is a known problem, especially in C++. Because in C++, there are some operators that do allocate memory behind the scenes, and there is no way for you to know that, until you actually read the source code of these operators, and find the memory allocation calls. Many programmers find this behaviour annoying and hard to keep track of.

    But, in Zig, if a function, an operator, or anything from the standard library needs to allocate some memory during its execution, then, this function/operator needs to receive (as input) an allocator provided by the user, to actually be able to allocate the memory it needs.

    This creates a clear distinction between functions that “do not” and those that “actually do” allocate memory. Just look at the arguments of the function. If a function, or operator, has an allocator object as one of its inputs/arguments, then you know for sure that this function/operator will allocate some memory during its execution.

    -

    An example is the allocPrint() function from the Zig standard library. With this function, you can write a new string using format specifiers. So, this function is, for example, very similar to the function sprintf() in C. In order to write such new string, the allocPrint() function needs to allocate some memory to store the output string.

    -

    That is why, the first argument of this function is an allocator object that you, the user/programmer, gives as input to the function. In the example below, I am using the GeneralPurposeAllocator() as my allocator object. But I could easily use any other type of allocator object from the Zig standard library.

    +

    An example is the allocPrint() function from the Zig Standard Library. With this function, you can write a new string using format specifiers. So, this function is, for example, very similar to the function sprintf() in C. In order to write such a new string, the allocPrint() function needs to allocate some memory to store the output string.

    +

    That is why the first argument of this function is an allocator object that you, the user/programmer, give as input to the function. In the example below, I am using the GeneralPurposeAllocator() as my allocator object. But I could easily use any other type of allocator object from the Zig Standard Library.

    -
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    -const allocator = gpa.allocator();
    -const name = "Pedro";
    -const output = try std.fmt.allocPrint(
    -    allocator,
    -    "Hello {s}!!!",
    -    .{name}
    -);
    -try stdout.print("{s}\n", .{output});
    +
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    +const allocator = gpa.allocator();
    +const name = "Pedro";
    +const output = try std.fmt.allocPrint(
    +    allocator,
    +    "Hello {s}!!!",
    +    .{name}
    +);
    +try stdout.print("{s}\n", .{output});
    Hello Pedro!!!

    You get a lot of control over where and how much memory this function can allocate. Because it is you, the user/programmer, that provides the allocator for the function to use. This makes “total control” over memory management easier to achieve in Zig.

    -
    -

    3.2.1 What are allocators?

    -

    Allocators in Zig are objects that you can use to allocate memory for your program. They are similar to the memory allocating functions in C, like malloc() and calloc(). So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask for more memory using an allocator.

    -

    Zig offers different types of allocators, and they are usually available through the std.heap module of the standard library. So, just import the Zig standard library into your Zig module (with @import("std")), and you can start using these allocators in your code.

    +
    +

    3.3.1 What are allocators?

    +

    Allocators in Zig are objects that you can use to allocate memory for your program. They are similar to the memory allocating functions in C, like malloc() and calloc(). So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask for more memory by using an allocator object.

    +

    Zig offers different types of allocators, and they are usually available through the std.heap module of the standard library. Thus, just import the Zig Standard Library into your Zig module (with @import("std")), and you can start using these allocators in your code.

    Furthermore, every allocator object is built on top of the Allocator interface in Zig. This means that, every allocator object you find in Zig must have the methods alloc(), create(), free() and destroy(). So, you can change the type of allocator you are using, but you don’t need to change the function calls to the methods that do the memory allocation (and the free memory operations) for your program.
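A minimal sketch of this idea is shown below. The `makeZeros()` helper is a hypothetical function written only for this illustration; it receives a `std.mem.Allocator` value and never needs to know which concrete allocator is behind it.

```zig
const std = @import("std");

// Hypothetical helper: it only depends on the `std.mem.Allocator` interface.
fn makeZeros(allocator: std.mem.Allocator, n: usize) ![]u8 {
    const buf = try allocator.alloc(u8, n);
    @memset(buf, 0);
    return buf;
}

pub fn main() !void {
    // First allocator: the page allocator.
    const page = std.heap.page_allocator;
    const a = try makeZeros(page, 4);
    defer page.free(a);

    // Second allocator: a fixed buffer allocator backed by a stack array.
    var storage: [16]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&storage);
    const fixed = fba.allocator();
    const b = try makeZeros(fixed, 4);
    defer fixed.free(b);

    std.debug.print("{d} {d}\n", .{ a.len, b.len });
}
```

The calls to `alloc()` and `free()` are identical in both cases. Only the line that builds the allocator object changes.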

    -
    -

    3.2.2 Why you need an allocator?

    +
    +

    3.3.2 Why you need an allocator?

    As we described in Section 3.1.4, every time you make a function call in Zig, a space in the stack is reserved for this function call. But the stack has a key limitation: every object stored in the stack has a known fixed length.

    But in reality, there are two very common instances where this “fixed length limitation” of the stack is a deal breaker:

      @@ -547,10 +562,10 @@

      Section 3.1.4, you cannot do that if this local object is stored in the stack. However, if this object is stored in the heap, then you can return a pointer to this object at the end of the function, because you (the programmer) control the lifetime of any heap memory that you allocate. You decide when this memory gets destroyed/freed.

      These are common situations that the stack is not well suited for. That is why you need a different memory management strategy to store these objects inside your function. You need a type of memory that can grow together with your objects, or whose lifetime you can control. The heap fits this description.

      -

      Allocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size during the execution of your program, you grow the amount of memory you have by allocating more memory in the heap to store these objects. And you that in Zig, by using an allocator object.

      +

      Allocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size during the execution of your program, you grow the amount of memory you have by allocating more memory in the heap to store these objects. And you do that in Zig, by using an allocator object.

    -
    -

    3.2.3 The different types of allocators

    +
    +

    3.3.3 The different types of allocators

    At the moment of the writing of this book, in Zig, we have 6 different allocators available in the standard library:

    • GeneralPurposeAllocator().
    • @@ -561,116 +576,134 @@

      -

      3.2.4 General-purpose allocators

      +
      +

      3.3.4 General-purpose allocators

      The GeneralPurposeAllocator(), as the name suggests, is a “general purpose” allocator. You can use it for every type of task. In the example below, I’m allocating enough space to store a single integer in the object some_number.

      -
      const std = @import("std");
      -
      -pub fn main() !void {
      -    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      -    const allocator = gpa.allocator();
      -    const some_number = try allocator.create(u32);
      -    defer allocator.destroy(some_number);
      -
      -    some_number.* = @as(u32, 45);
      -}
      +
      const std = @import("std");
      +
      +pub fn main() !void {
      +    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      +    const allocator = gpa.allocator();
      +    const some_number = try allocator.create(u32);
      +    defer allocator.destroy(some_number);
      +
      +    some_number.* = @as(u32, 45);
      +}

      While useful, you might want to use the c_allocator(), which is an alias to the C standard allocator malloc(). So, yes, you can use malloc() in Zig if you want to. Just use the c_allocator() from the Zig standard library. However, if you do use c_allocator(), you must link to Libc when compiling your source code with the zig compiler, by including the flag -lc in your compilation process. If you do not link your source code to Libc, Zig will not be able to find the malloc() implementation in your system.

      -
      -

      3.2.5 Page allocator

      +
      +

      3.3.5 Page allocator

      The page_allocator() is an allocator that allocates full pages of memory in the heap. In other words, every time you allocate memory with page_allocator(), a full page of memory in the heap is allocated, instead of just a small piece of it.

      The size of this page depends on the system you are using. Most systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally allocated in each call by page_allocator(). That is why, page_allocator() is considered a fast, but also “wasteful” allocator in Zig. Because it allocates a big amount of memory in each call, and you most likely will not need that much memory in your program.

      -
      -

      3.2.6 Buffer allocators

      -

      The FixedBufferAllocator() and ThreadSafeFixedBufferAllocator() are allocator objects that work with a fixed sized buffer that is stored in the stack. So these two allocators only allocates memory in the stack. This also means that, in order to use these allocators, you must first create a buffer object, and then, give this buffer as an input to these allocators.

      -

      In the example below, I am creating a buffer object that is 10 elements long. Notice that I give this buffer object to the FixedBufferAllocator() constructor. Now, because this buffer object is 10 elements long, this means that I am limited to this space. I cannot allocate more than 10 elements with this allocator object. If I try to allocate more than that, the alloc() method will return an OutOfMemory error value.

      +
      +

      3.3.6 Buffer allocators

      +

      The FixedBufferAllocator() and ThreadSafeFixedBufferAllocator() are allocator objects that work with a fixed-size buffer object behind the scenes. In other words, they use a fixed-size buffer object as the basis for the memory. When you ask these allocator objects to allocate some memory for you, they are essentially reserving some amount of space inside this fixed-size buffer object for you to use.

      +

      This means that, in order to use these allocators, you must first create a buffer object in your code, and then, give this buffer object as an input to these allocators.

      +

      This also means that these allocator objects can allocate memory either in the stack or in the heap. Everything depends on where the buffer object that you provide lives. If this buffer object lives in the stack, then the memory allocated is “stack-based”. But if it lives on the heap, then the memory allocated is “heap-based”.

      +

      In the example below, I’m creating a buffer object on the stack that is 10 elements long. Notice that I give this buffer object to the FixedBufferAllocator() constructor. Now, because this buffer object is 10 elements long, this means that I am limited to this space. I cannot allocate more than 10 elements with this allocator object. If I try to allocate more than that, the alloc() method will return an OutOfMemory error value.

      -
      var buffer: [10]u8 = undefined;
      -for (0..buffer.len) |i| {
      -    buffer[i] = 0; // Initialize to zero
      -}
      -
      -var fba = std.heap.FixedBufferAllocator.init(&buffer);
      -const allocator = fba.allocator();
      -const input = try allocator.alloc(u8, 5);
      -defer allocator.free(input);
      +
      var buffer: [10]u8 = undefined;
      +for (0..buffer.len) |i| {
      +    buffer[i] = 0; // Initialize to zero
      +}
      +
      +var fba = std.heap.FixedBufferAllocator.init(&buffer);
      +const allocator = fba.allocator();
      +const input = try allocator.alloc(u8, 5);
      +defer allocator.free(input);
      +
      +
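To make the OutOfMemory case concrete, here is a small sketch that builds on the same 10-element buffer. The request for 20 elements is my own addition for illustration; it is deliberately bigger than the buffer so that `alloc()` fails.

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [10]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    // 5 elements fit inside the 10-element buffer.
    const ok = try allocator.alloc(u8, 5);
    defer allocator.free(ok);

    // 20 elements do not fit, so `alloc()` returns an error value
    // instead of a slice.
    if (allocator.alloc(u8, 20)) |too_big| {
        allocator.free(too_big);
        std.debug.print("unexpected success\n", .{});
    } else |err| {
        std.debug.print("allocation failed: {s}\n", .{@errorName(err)});
    }
}
```

Because the error is handled with an `if`/`else` instead of `try`, the program keeps running and simply reports the name of the error.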

      Remember, the memory allocated by these allocator objects can be either from the stack, or, from the heap. It all depends on where the buffer object that you provide lives. In the above example, the buffer object lives in the stack, and, therefore, the memory allocated is based in the stack. But what if it was based on the heap?

      +

      As we described at Section 3.2, one of the main reasons why you would use the heap, instead of the stack, is to allocate huge amounts of space to store very big objects. Thus, let’s suppose you wanted to use a very big buffer object as the basis for your allocator objects. You would have to allocate this very big buffer object on the heap. The example below demonstrates this case.

      +
      +
      const heap = std.heap.page_allocator;
      +const memory_buffer = try heap.alloc(
      +    u8, 100 * 1024 * 1024 // 100 MB memory
      +);
      +defer heap.free(memory_buffer);
      +var fba = std.heap.FixedBufferAllocator.init(
      +    memory_buffer
      +);
      +const allocator = fba.allocator();
      +
      +const input = try allocator.alloc(u8, 1000);
      +defer allocator.free(input);
      -
      -

      3.2.7 Arena allocator

      +
      +

      3.3.7 Arena allocator

      The ArenaAllocator() is an allocator object that takes a child allocator as input. The idea behind the ArenaAllocator() in Zig is similar to the concept of “arenas” in the programming language Go5. It is an allocator object that allows you to allocate memory as many times as you want, but free all memory only once. In other words, if you have, for example, called the alloc() method of an ArenaAllocator() object 5 times, you can free all the memory you allocated over these 5 calls at once, by simply calling the deinit() method of the same ArenaAllocator() object.

      If you give, for example, a GeneralPurposeAllocator() object as input to the ArenaAllocator() constructor, like in the example below, then the allocations you perform with alloc() will actually be made with the underlying GeneralPurposeAllocator() object that was passed. So, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator really does is help you free all the memory you allocated multiple times with just a single command. In the example below, I called alloc() 3 times. So, if I had not used an arena allocator, I would need to call free() 3 times to free all the allocated memory.

      -
      var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      -var aa = std.heap.ArenaAllocator.init(gpa.allocator());
      -defer aa.deinit();
      -const allocator = aa.allocator();
      -
      -const in1 = allocator.alloc(u8, 5);
      -const in2 = allocator.alloc(u8, 10);
      -const in3 = allocator.alloc(u8, 15);
      -_ = in1; _ = in2; _ = in3;
      +
      var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      +var aa = std.heap.ArenaAllocator.init(gpa.allocator());
      +defer aa.deinit();
      +const allocator = aa.allocator();
      +
      +const in1 = try allocator.alloc(u8, 5);
      +const in2 = try allocator.alloc(u8, 10);
      +const in3 = try allocator.alloc(u8, 15);
      +_ = in1; _ = in2; _ = in3;
      -
      -

      3.2.8 The alloc() and free() methods

      +
      +

      3.3.8 The alloc() and free() methods

      In the code example below, we are accessing the stdin, which is the standard input channel, to receive an input from the user. We read the input given by the user with the readUntilDelimiterOrEof() method.

      Now, after reading the input of the user, we need to store this input somewhere in our program. That is why I use an allocator in this example. I use it to allocate some amount of memory to store this input given by the user. More specifically, the method alloc() of the allocator object is used to allocate an array capable of storing 50 u8 values.

      Notice that this alloc() method receives two inputs. The first one is a type. This defines what type of values the allocated array will store. In the example below, we are allocating an array of unsigned 8-bit integers (u8). But you can create an array to store any type of value you want. Next, in the second argument, we define the size of the allocated array, by specifying how many elements this array will contain. In the case below, we are allocating an array of 50 elements.

      In Section 1.8 we described that strings in Zig are simply arrays of characters. Each character is represented by a u8 value. So, this means that the array that was allocated in the object input is capable of storing a string that is 50 characters long.

      So, in essence, the expression var input: [50]u8 = undefined would create an array for 50 u8 values in the stack of the current scope. But, you can allocate the same array in the heap by using the expression var input = try allocator.alloc(u8, 50).

      -
      const std = @import("std");
      -const stdin = std.io.getStdIn();
      -
      -pub fn main() !void {
      -    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      -    const allocator = gpa.allocator();
      -    var input = try allocator.alloc(u8, 50);
      -    defer allocator.free(input);
      -    for (0..input.len) |i| {
      -        input[i] = 0; // initialize all fields to zero.
      -    }
      -    // read user input
      -    const input_reader = stdin.reader();
      -    _ = try input_reader.readUntilDelimiterOrEof(
      -        input,
      -        '\n'
      -    );
      -    std.debug.print("{s}\n", .{input});
      -}
      +
      const std = @import("std");
      +const stdin = std.io.getStdIn();
      +
      +pub fn main() !void {
      +    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      +    const allocator = gpa.allocator();
      +    var input = try allocator.alloc(u8, 50);
      +    defer allocator.free(input);
      +    for (0..input.len) |i| {
      +        input[i] = 0; // initialize all fields to zero.
      +    }
      +    // read user input
      +    const input_reader = stdin.reader();
      +    _ = try input_reader.readUntilDelimiterOrEof(
      +        input,
      +        '\n'
      +    );
      +    std.debug.print("{s}\n", .{input});
      +}

      Also, notice that in this example, we use the defer keyword (which I described at Section 2.1.3) to run a small piece of code at the end of the current scope, which is the expression allocator.free(input). When you execute this expression, the allocator will free the memory that it allocated for the input object.

      We have talked about this at Section 3.1.5. You should always explicitly free any memory that you allocate using an allocator! You do that by using the free() method of the same allocator object you used to allocate this memory. The defer keyword is used in this example only to help us execute this free operation at the end of the current scope.

      -
      -

      3.2.9 The create() and destroy() methods

      +
      +

      3.3.9 The create() and destroy() methods

      With the alloc() and free() methods, you can allocate memory to store multiple elements at once. In other words, with these methods, we always allocate an array to store multiple elements at once. But what if you need enough space to store just a single item? Should you allocate an array of a single element through alloc()?

      The answer is no! In this case, you should use the create() method of the allocator object. Every allocator object offers the create() and destroy() methods, which are used to allocate and free memory for a single item, respectively.

      So, in essence, if you want to allocate memory to store an array of elements, you should use alloc() and free(). But if you need to store just a single item, then, the create() and destroy() methods are ideal for you.

      In the example below, I’m defining a struct to represent a user of some sort. It could be a user for a game, or a software to manage resources, it doesn’t matter. Notice that I use the create() method this time, to store a single User object in the program. Also notice that I use the destroy() method to free the memory used by this object at the end of the scope.

      -
      const std = @import("std");
      -const User = struct {
      -    id: usize,
      -    name: []const u8,
      -
      -    pub fn init(id: usize, name: []const u8) User {
      -        return .{ .id = id, .name = name };
      -    }
      -};
      -
      -pub fn main() !void {
      -    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      -    const allocator = gpa.allocator();
      -    const user = try allocator.create(User);
      -    defer allocator.destroy(user);
      -
      -    user.* = User.init(0, "Pedro");
      -}
      +
      const std = @import("std");
      +const User = struct {
      +    id: usize,
      +    name: []const u8,
      +
      +    pub fn init(id: usize, name: []const u8) User {
      +        return .{ .id = id, .name = name };
      +    }
      +};
      +
      +pub fn main() !void {
      +    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      +    const allocator = gpa.allocator();
      +    const user = try allocator.create(User);
      +    defer allocator.destroy(user);
      +
      +    user.* = User.init(0, "Pedro");
      +}
      diff --git a/docs/Chapters/01-zig-weird.html b/docs/Chapters/01-zig-weird.html index 1bee925..2e8c246 100644 --- a/docs/Chapters/01-zig-weird.html +++ b/docs/Chapters/01-zig-weird.html @@ -932,7 +932,7 @@

      In contrast, the Zig language is not a memory safe language by default. There are some memory safety features that you get for free in Zig, especially in arrays and pointer objects. But there are other tools offered by the language that are not used by default. In other words, the zig compiler does not oblige you to use such tools.

      The tools listed below are related to memory safety. That is, they help you to achieve memory safety in your Zig code:

        -
      • defer allows you to keep free operations phisically close to allocations. This helps you to avoid memory leaks, “use after free”, and also “double-free” problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.
      • +
      • defer allows you to keep free operations physically close to allocations. This helps you to avoid memory leaks, “use after free”, and also “double-free” problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.
      • errdefer helps you to guarantee that your program frees the allocated memory, even if a runtime error occurs.
      • pointers and objects are non-nullable by default. This helps you to avoid memory problems that might arise from de-referencing null pointers.
      • Zig offers some native types of allocators (called “testing allocators”) that can detect memory leaks and double-frees. These types of allocators are widely used on unit tests, so they transform your unit tests into a weapon that you can use to detect memory problems in your code.
      • diff --git a/docs/Chapters/03-structs.html b/docs/Chapters/03-structs.html index 328132d..6dcac4b 100644 --- a/docs/Chapters/03-structs.html +++ b/docs/Chapters/03-structs.html @@ -288,7 +288,11 @@

        Table of contents

      • 2.1.6 While loops
      • 2.1.7 Using break and continue
      -
    • 2.2 Function parameters are immutable
    • +
    • 2.2 Function parameters are immutable +
    • 2.3 Structs and OOP
      • 2.3.1 Anonymous struct literals
      • @@ -630,9 +634,9 @@

        2.2 Function parameters are immutable

        We have already discussed a lot of the syntax behind function declarations in Section 1.2.2 and Section 1.2.3. But I want to emphasize a curious fact about function parameters (a.k.a. function arguments) in Zig. In summary, function parameters are immutable in Zig.

        -

        Take the code example below, where we declare a simple function that just tries to add some amount to the input integer, and returns the result back. But if you look closely at the body of this add2() function, you will notice that we try to save the result back into the x function argument.

        +

        Take the code example below, where we declare a simple function that just tries to add some amount to the input integer, and returns the result back. If you look closely at the body of this add2() function, you will notice that we try to save the result back into the x function argument.

        In other words, this function not only uses the value that it received through the function argument x, but it also tries to change the value of this function argument, by assigning the addition result into x. However, function arguments in Zig are immutable. You cannot change their values, that is, you cannot assign values to them inside the function’s body.

        -

        This is the reason why, the code example below do not compile successfully. If you try to compile this code example, you get a compile error warning you that you are trying to change the value of a immutable (i.e. constant) object.

        +

        This is the reason why the code example below does not compile successfully. If you try to compile this code example, you will get a compile error message about “trying to change the value of an immutable (i.e. constant) object”.

        const std = @import("std");
         fn add2(x: u32) u32 {
        @@ -648,11 +652,16 @@ 

        t.zig:3:5: error: cannot assign to constant x = x + 2; ^

        +
        +

        2.2.1 A free optimization

        If a function argument receives as input an object whose data type is any of the primitive types that we have listed in Section 1.5, this object is always passed by value to the function. In other words, this object is copied into the function stack frame.

        However, if the input object has a more complex data type, for example, it might be a struct instance, or an array, or a union value, etc., in cases like that, the zig compiler will take the liberty of deciding for you which strategy is best. Thus, the zig compiler will pass your object to the function either by value, or by reference. The compiler will always choose the strategy that is faster for you. This optimization that you get for free is possible only because function arguments are immutable in Zig.

        +
        +
        +

        2.2.2 How to overcome this barrier

        There are some situations where you might need to change the value of your function argument directly inside the function’s body. This happens more often when we are passing C structs as inputs to Zig functions.

        -

        In a situation like this, you can overcome this barrier of immutable function arguments, by simply taking the lead, and explicitly choosing to pass the object by reference to the function. That is, instead of depending on the zig compiler to decide which strategy is best, you have to explicitly mark the function argument as a pointer. This way, we are telling the compiler that this function argument will be passed by reference to the function.

        -

        By making it a pointer, we can finally alter the value of this function argument directly inside the body of the add2() function. You can see that the code example below compiles successfully.

        +

        In a situation like this, you can overcome this barrier by using a pointer. In other words, instead of passing a value as input to the argument, you can pass a “pointer to value” instead. You can change the value that the pointer points to, by dereferencing it.

        +

        Therefore, if we take our previous add2() example, we can change the value of the function argument x inside the function’s body by marking the x argument as a “pointer to a u32 value” (i.e. *u32 data type), instead of a u32 value. By making it a pointer, we can finally alter the value of this function argument directly inside the body of the add2() function. You can see that the code example below compiles successfully.

        const std = @import("std");
         fn add2(x: *u32) void {
        @@ -667,6 +676,8 @@ 

        }

        Result: 6
        +

        Even in this code example above, the x argument is still immutable. Which means that the pointer itself is immutable. Therefore, you cannot change the memory address that it points to. However, you can dereference the pointer to access the value that it points to, and also, to change this value, if you need to.
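Since the diff only shows fragments of that example, here is a self-contained sketch of the same idea. The body of `add2()` and the starting value 4 are my reconstruction, chosen so that the output matches the “Result: 6” shown above; they are not copied from the book's source.

```zig
const std = @import("std");

fn add2(x: *u32) void {
    // `x` itself is immutable, so `x = some_other_pointer;` would not compile.
    // But the value that `x` points to can be changed by dereferencing it.
    x.* = x.* + 2;
}

pub fn main() void {
    var value: u32 = 4;
    add2(&value);
    std.debug.print("Result: {d}\n", .{value});
}
```

Notice that `value` must be declared with `var`, because `add2()` modifies it through the pointer.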

        +
    • 2.3 Structs and OOP

      @@ -674,7 +685,7 @@

      With struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C. You give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can also register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object that you create with this new type, will always have these methods available and associated with them.

      In C++, when we create a new class, we normally have a constructor method (or, a constructor function) which is used to construct (or, to instantiate) every object of this particular class, and we also have a destructor method (or a destructor function), which is the function responsible for destroying every object of this class.

      In Zig, we normally declare the constructor and the destructor methods of our structs by declaring init() and deinit() methods inside the struct. This is just a naming convention that you will find across the entire Zig Standard Library. So, in Zig, the init() method of a struct is normally the constructor method of the class represented by this struct, while the deinit() method is the method used for destroying an existing instance of that struct.

      -

      The init() and deinit() methods are both used extensively in Zig code, and you will see both of them being used when we talk about allocators in Section 3.2. But, as another example, let’s build a simple User struct to represent a user of some sort of system.

      +

      The init() and deinit() methods are both used extensively in Zig code, and you will see both of them being used when we talk about allocators in Section 3.3. But, as another example, let’s build a simple User struct to represent a user of some sort of system.

      If you look at the User struct below, you can see the struct keyword. Notice the data members of this struct: id, name and email. Every data member has its type explicitly annotated, with the colon character (:) syntax that we described earlier in Section 1.2.2. But also notice that every line in the struct body that describes a data member ends with a comma character (,). So every time you declare a data member in your Zig code, always end the line with a comma character, instead of ending it with the traditional semicolon character (;).

      Next, we have registered an init() function as a method of this User struct. This init() method is the constructor method that we will use to instantiate every new User object. That is why this init() function returns a new User object as result.
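A struct of the shape described above might look like the sketch below. The field types (`u64`, `[]const u8`) and the sample values passed to `init()` are assumptions made for illustration; this hunk of the diff does not show them.

```zig
const std = @import("std");

const User = struct {
    // Each data member is annotated with `name: type` and ends with a comma.
    id: u64,
    name: []const u8,
    email: []const u8,

    // By convention, init() acts as the constructor and returns a new User.
    pub fn init(id: u64, name: []const u8, email: []const u8) User {
        return User{ .id = id, .name = name, .email = email };
    }
};

pub fn main() void {
    // Hypothetical values, used only to show the call.
    const u = User.init(1, "pedro", "email@gmail.com");
    std.debug.print("{s}\n", .{u.name});
}
```

Because `init()` is declared inside the struct body, it is reached through the type's namespace, as in `User.init(...)`.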

      @@ -812,26 +823,26 @@

               return m.sqrt(xd + yd + zd);
           }
      -    pub fn double(self: *Vec3) void {
      +    pub fn twice(self: *Vec3) void {
               self.x = self.x * 2.0;
               self.y = self.y * 2.0;
               self.z = self.z * 2.0;
           }
       };

    -

    Notice in the code example above that we have added a new method to our Vec3 struct named double(). This method doubles the coordinate values of our vector object. In the case of the double() method, we annotated the self argument as *Vec3, indicating that this argument receives a pointer (or a reference, if you prefer to call it this way) to a Vec3 object as input.

    +

    Notice in the code example above that we have added a new method to our Vec3 struct named twice(). This method doubles the coordinate values of our vector object. In the case of the twice() method, we annotated the self argument as *Vec3, indicating that this argument receives a pointer (or a reference, if you prefer to call it this way) to a Vec3 object as input.

    var v3 = Vec3 {
         .x = 4.2, .y = 2.4, .z = 0.9
     };
    -v3.double();
    +v3.twice();
     std.debug.print("Doubled: {d}\n", .{v3.x});
    Doubled: 8.4
    -

    Now, if you change the self argument in this double() method to self: Vec3, like in the distance() method, you will get the compiler error exposed below as result. Notice that this error message is showing a line from the double() method body, indicating that you cannot alter the value of the x data member.

    +

    Now, if you change the self argument in this twice() method to self: Vec3, like in the distance() method, you will get the compiler error exposed below as result. Notice that this error message is showing a line from the twice() method body, indicating that you cannot alter the value of the x data member.

    // If we change the function signature of twice to:
    -    pub fn double(self: Vec3) void {
    +    pub fn twice(self: Vec3) void {
    t.zig:16:13: error: cannot assign to constant
             self.x = self.x * 2.0;
    diff --git a/docs/Chapters/09-data-structures.html b/docs/Chapters/09-data-structures.html
    index 30e44a4..c3cf762 100644
    --- a/docs/Chapters/09-data-structures.html
    +++ b/docs/Chapters/09-data-structures.html
    @@ -408,7 +408,7 @@ 

    11.1.2 Creating an ArrayList object

    -

    In order to use ArrayList, you must provide an allocator object to it. Remember, Zig does not have a default memory allocator. And as I described at Section 3.2, all memory allocations must be done by an allocator object that you define, that you have control over. In our example here, I’m going to use a general purpose allocator, but you can use any other allocator of your preference.

    +

    In order to use ArrayList, you must provide an allocator object to it. Remember, Zig does not have a default memory allocator. And as I described at Section 3.3, all memory allocations must be done by an allocator object that you define, that you have control over. In our example here, I’m going to use a general purpose allocator, but you can use any other allocator of your preference.

    When you initialize an ArrayList object, you must provide the data type of the elements of the array. In other words, this defines the type of data that this array (or container) will store. Therefore, if I provide the u8 type to it, then, I will create a dynamic array of u8 values. However, if I provide a struct that I have defined instead, like the struct User from Section 2.3, then, a dynamic array of User values will be created. In the example below, with the expression ArrayList(u8) we are creating a dynamic array of u8 values.

    After you provide the data type of the elements of the array, you can initialize an ArrayList object by either using the init() or the initCapacity() methods. The former method receives only the allocator object as input, while the latter method receives both the allocator object and a capacity number as inputs. With the latter method, you not only initialize the struct, but you also set the starting capacity of the allocated array.

    Using the initCapacity() method is the preferred way to initialize your dynamic array, because a reallocation (the process of expanding the capacity of the array) is always a high-cost operation. You should take any possible opportunity to avoid reallocations in your array. If you know how much space your array needs to occupy at the beginning, you should always use initCapacity() to create your dynamic array.
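As a rough sketch of the initialization described above, the snippet below creates a dynamic array of u8 values with a starting capacity. It assumes the allocator-storing ArrayList API that this text describes (roughly the Zig 0.13 era), where deinit() takes no arguments; other Zig releases may differ. The capacity of 100 and the appended text are arbitrary choices for the example.

```zig
const std = @import("std");

pub fn main() !void {
    // Assumption: GeneralPurposeAllocator is available under this name.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Reserve room for 100 elements up front to avoid early reallocations.
    var buffer = try std.ArrayList(u8).initCapacity(allocator, 100);
    defer buffer.deinit();

    // Appending within the reserved capacity does not need to grow the array.
    try buffer.appendSlice("zig");
    std.debug.print("items: {s}, capacity: {d}\n", .{ buffer.items, buffer.capacity });
}
```

With init() instead of initCapacity(), the array would start empty and grow on demand, paying the reallocation cost as elements are appended.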

    diff --git a/docs/Chapters/09-error-handling.html b/docs/Chapters/09-error-handling.html
    index 206068d..af0d5a3 100644
    --- a/docs/Chapters/09-error-handling.html
    +++ b/docs/Chapters/09-error-handling.html
    @@ -551,7 +551,7 @@

    10.2.4 The errdefer keyword

    A common pattern in C programs is to clean up resources when an error occurs during the execution of the program. In other words, one common way to handle errors is to perform “cleanup actions” before we exit our program. This guarantees that a runtime error does not cause our program to leak system resources.

    The errdefer keyword is a tool to perform such “cleanup actions” in hostile situations. This keyword is commonly used to clean up (or to free) allocated resources before the execution of our program gets stopped because of an error value being generated.

    -

    The basic idea is to provide an expression to the errdefer keyword. Then, errdefer executes this expression if, and only if, an error occurs during the execution of the current scope. In the example below, we are using an allocator object (that we have presented at Section 3.2) to create a new User object. If we are successful in creating and registering this new user, this create_user() function will return this new User object as its return value.

    +

    The basic idea is to provide an expression to the errdefer keyword. Then, errdefer executes this expression if, and only if, an error occurs during the execution of the current scope. In the example below, we are using an allocator object (that we have presented at Section 3.3) to create a new User object. If we are successful in creating and registering this new user, this create_user() function will return this new User object as its return value.

    However, if for some reason an error value is generated by some expression that comes after the errdefer line, for example, in the db.add(user) expression, the expression registered by errdefer gets executed before the error value is returned from the function, and before the program enters panic mode and stops the current execution.

    fn create_user(db: Database, allocator: Allocator) !User {
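The errdefer behaviour described above can also be sketched in a self-contained form. The names `register()` and `createValue()` are hypothetical stand-ins (for something like db.add(user) and create_user()); they are not part of the book's code or of any library.

```zig
const std = @import("std");

// Hypothetical step that always fails, standing in for a call like db.add(user).
fn register(value: *u32) !void {
    _ = value;
    return error.RegistrationFailed;
}

fn createValue(allocator: std.mem.Allocator) !*u32 {
    const ptr = try allocator.create(u32);
    // Runs only if this function returns an error after this point.
    errdefer allocator.destroy(ptr);

    ptr.* = 42;
    try register(ptr); // fails, so the errdefer above frees `ptr`
    return ptr;
}

test "errdefer frees memory on the error path" {
    // std.testing.allocator reports leaks, so this test would fail
    // if the allocation were not destroyed before the error is returned.
    try std.testing.expectError(
        error.RegistrationFailed,
        createValue(std.testing.allocator),
    );
}
```

On the success path the errdefer expression is skipped, and the caller becomes responsible for destroying the returned pointer.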
    diff --git a/docs/Chapters/10-stack-project.html b/docs/Chapters/10-stack-project.html
    index 5cea743..bd314e7 100644
    --- a/docs/Chapters/10-stack-project.html
    +++ b/docs/Chapters/10-stack-project.html
    @@ -360,21 +360,21 @@ 

    12.1.1 Applying over a function argument

    When you apply the comptime keyword on a function argument, you are telling the zig compiler that the value assigned to that particular function argument must be known at compile-time. We explained in detail at Section 3.1.1 what exactly “value known at compile-time” means, so, in case you have doubts about this idea, come back to that section.

    Now let’s think about the consequences of this idea. First of all, we are imposing a limit, or a requirement, on that particular function argument. If the programmer accidentally tries to give a value to this function argument that is not known at compile time, the zig compiler will notice this problem, and as a consequence, it will raise a compilation error saying that it cannot compile your program, because you are providing a value that is “runtime known” to a function argument that must be “compile-time known”.

    -

    Take a look at this very simple example below, where we define a double() function, that simply doubles the input value named num. Notice that we use the comptime keyword before the name of the function argument. This keyword is marking the function argument num as a “comptime argument”.

    -

    That is a function argument whose value must be compile-time known. This is why the expression double(5678) is valid, and no compilation errors are raised. Because the value 5678 is compile-time known, so this is the expected behaviour for this function.

    +

    Take a look at this very simple example below, where we define a twice() function, that simply doubles the input value named num. Notice that we use the comptime keyword before the name of the function argument. This keyword is marking the function argument num as a “comptime argument”.

    +

    That is a function argument whose value must be compile-time known. This is why the expression twice(5678) is valid, and no compilation errors are raised. Because the value 5678 is compile-time known, so this is the expected behaviour for this function.

    -
    fn double(comptime num: u32) u32 {
    +
    fn twice(comptime num: u32) u32 {
         return num * 2;
     }
     test "test comptime" {
    -    _ = double(5678);
    +    _ = twice(5678);
     }

    But what if we provide a number that is not compile-time known to this function? For example, we might provide a different input value to this function depending on the target OS of our compilation process. The code example below demonstrates such a case.

    -

    Because the value of the object n is determined at runtime, we cannot provide this object as input to the double() function. The zig compiler will not allow it, because we marked the num argument as a “comptime argument”. That is why the zig compiler raises the compile-time error exposed below:

    +

    Because the value of the object n is determined at runtime, we cannot provide this object as input to the twice() function. The zig compiler will not allow it, because we marked the num argument as a “comptime argument”. That is why the zig compiler raises the compile-time error exposed below:

    const builtin = @import("builtin");
    -fn double(comptime num: u32) u32 {
    +fn twice(comptime num: u32) u32 {
         return num * 2;
     }
     test "test comptime" {
    @@ -384,7 +384,7 @@ 

         } else {
             n = 5678;
         }
    -    _ = double(n);
    +    _ = twice(n);
     }

    t.zig:12:16: error: runtime-known argument passed to comptime parameter 
    diff --git a/docs/index.html b/docs/index.html
    index 8604d08..88fd776 100644
    --- a/docs/index.html
    +++ b/docs/index.html
    @@ -7,7 +7,7 @@
    -
    +
    Introduction to Zig