Merge pull request #84 from pedropark99/dev-fix
Fix several issues at once
pedropark99 authored Oct 21, 2024
2 parents 34cdec5 + 9e929f9 commit 491ad4a
Showing 17 changed files with 346 additions and 196 deletions.
107 changes: 93 additions & 14 deletions Chapters/01-memory.qmd
@@ -387,6 +387,47 @@ object you declare is stored:
1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then it is most certainly not an object stored in the heap.


## Stack overflows {#sec-stack-overflow}

Allocating memory on the stack is generally faster than allocating it on the heap.
But this better performance comes with many restrictions. We have already discussed
many of these restrictions of the stack in @sec-stack. But there is one more important
limitation that I want to talk about, which is the size of the stack itself.

The stack is limited in size. This size varies from computer to computer, and it depends on
a lot of things (the computer architecture, the operating system, etc.). Nevertheless, this size is usually
not that big. This is why we normally use the stack to store only temporary and small objects in memory.

In essence, if you try to make an allocation on the stack that is so big that it exceeds the stack size limit,
a *stack overflow* happens, and your program crashes as a result. In other words, a stack overflow happens when
you attempt to use more space than is available on the stack.

This type of problem is very similar to a *buffer overflow*, i.e. you are trying to use more space
than is available in the "buffer object". However, a stack overflow always causes your program to crash,
while a buffer overflow does not always cause your program to crash (although it often does).

You can see an example of a stack overflow below. We are trying to allocate a very big array of `u64` values
on the stack. This program does not run successfully, because it crashes
with a "segmentation fault" error message.

```{zig}
#| eval: false
#| build_type: "run"
#| auto_main: true
var very_big_alloc: [1000 * 1000 * 24]u64 = undefined;
@memset(very_big_alloc[0..], 0);
```

```
Segmentation fault (core dumped)
```

This segmentation fault error is a result of the stack overflow that was caused by the big
memory allocation made on the stack, to store the `very_big_alloc` object.
This is why very big objects are usually stored on the heap, instead of the stack.
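
As a rough sketch of the alternative, the same number of `u64` values can be requested from the heap instead. The `allocZeroed()` helper is a hypothetical name used only for this illustration, and I am assuming `std.heap.page_allocator` as the allocator; any other heap allocator would work the same way.

```zig
const std = @import("std");

// Hypothetical helper: allocates `n` zeroed u64 values on the heap.
// The caller owns the slice and must free it with the same allocator.
fn allocZeroed(allocator: std.mem.Allocator, n: usize) ![]u64 {
    const items = try allocator.alloc(u64, n);
    @memset(items, 0);
    return items;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    // Same element count as `very_big_alloc` (about 192 MB of memory).
    const very_big_alloc = try allocZeroed(allocator, 1000 * 1000 * 24);
    defer allocator.free(very_big_alloc);
    std.debug.print("allocated {d} elements\n", .{very_big_alloc.len});
}
```

Because the memory now comes from the heap, the allocation is limited by the memory the operating system can give you, not by the stack size.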



## Allocators {#sec-allocators}

One key aspect of Zig is that it has "no hidden memory allocations".
@@ -406,14 +447,14 @@ allocate memory. Just look at the arguments of this function.
If a function, or operator, has an allocator object as one of its inputs/arguments, then, you know for
sure that this function/operator will allocate some memory during its execution.

An example is the `allocPrint()` function from the Zig standard library. With this function, you can
An example is the `allocPrint()` function from the Zig Standard Library. With this function, you can
write a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C.
In order to write such a new string, the `allocPrint()` function needs to allocate some memory to store the
output string.

That is why the first argument of this function is an allocator object that you, the user/programmer, give
as input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator
object. But I could easily use any other type of allocator object from the Zig standard library.
object. But I could easily use any other type of allocator object from the Zig Standard Library.

```{zig}
#| auto_main: true
@@ -430,24 +471,23 @@ try stdout.print("{s}\n", .{output});
```


You get a lot of control
over where and how much memory this function can allocate. Because it is you,
the user/programmer, that provides the allocator for the function to use.
You get a lot of control over where and how much memory this function can allocate.
Because it is you, the user/programmer, that provides the allocator for the function to use.
This makes "total control" over memory management easier to achieve in Zig.

### What are allocators?

Allocators in Zig are objects that you can use to allocate memory for your program.
They are similar to the memory allocating functions in C, like `malloc()` and `calloc()`.
So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask
for more memory using an allocator.
for more memory by using an allocator object.

Zig offers different types of allocators, and they are usually available through the `std.heap` module of
the standard library. So, just import the Zig standard library into your Zig module (with `@import("std")`), and you can start
the standard library. Thus, just import the Zig Standard Library into your Zig module (with `@import("std")`), and you can start
using these allocators in your code.

Furthermore, every allocator object is built on top of the `Allocator` interface in Zig. This
means that, every allocator object you find in Zig must have the methods `alloc()`,
Furthermore, every allocator object is built on top of the `Allocator` interface in Zig.
This means that, every allocator object you find in Zig must have the methods `alloc()`,
`create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using,
but you don't need to change the function calls to the methods that do the memory allocation
(and the free memory operations) for your program.
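
To make this concrete, here is a minimal sketch of a function that only depends on the `std.mem.Allocator` interface. The `makeGreeting()` helper is a hypothetical name introduced for this illustration. The same call works with two different allocators, and nothing inside the function needs to change.

```zig
const std = @import("std");

// Hypothetical helper: it only knows about the `std.mem.Allocator`
// interface, not about the concrete allocator behind it.
fn makeGreeting(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(allocator, "Hello, {s}!", .{name});
}

pub fn main() !void {
    // Allocator backed by a fixed buffer that lives on the stack.
    var buffer: [256]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const a = fba.allocator();
    const first = try makeGreeting(a, "stack");
    defer a.free(first);

    // Allocator backed by pages requested from the operating system.
    const b = std.heap.page_allocator;
    const second = try makeGreeting(b, "heap");
    defer b.free(second);

    std.debug.print("{s}\n{s}\n", .{ first, second });
}
```
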
@@ -480,7 +520,7 @@ The heap fit this description.
Allocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size
during the execution of your program, you grow the amount of memory
you have by allocating more memory in the heap to store these objects.
And you that in Zig, by using an allocator object.
And you do that in Zig, by using an allocator object.


### The different types of allocators
@@ -543,11 +583,20 @@ in each call, and you most likely will not need that much memory in your program
### Buffer allocators

The `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that
work with a fixed sized buffer that is stored in the stack. So these two allocators only allocates
memory in the stack. This also means that, in order to use these allocators, you must first
create a buffer object, and then, give this buffer as an input to these allocators.
work with a fixed-size buffer object behind the scenes. In other words, they use a fixed-size buffer
object as the backing memory. When you ask these allocator objects to allocate some memory for you,
they are essentially reserving some amount of space inside this fixed-size buffer object for you to use.

In the example below, I am creating a `buffer` object that is 10 elements long.
This means that, in order to use these allocators, you must first create a buffer object in your code,
and then, give this buffer object as an input to these allocators.

This also means that these allocator objects can allocate memory either on the stack or on the heap.
Everything depends on where the buffer object that you provide lives. If this buffer object lives
on the stack, then the memory allocated is "stack-based". But if it lives on the heap, then
the memory allocated is "heap-based".


In the example below, I'm creating a `buffer` object on the stack that is 10 elements long.
Notice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.
Now, because this `buffer` object is 10 elements long, this means that I am limited to this space.
I cannot allocate more than 10 elements with this allocator object. If I try to
@@ -567,6 +616,36 @@ const input = try allocator.alloc(u8, 5);
defer allocator.free(input);
```
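
The size limit described above can be observed directly. The sketch below assumes a 10-byte `buffer` like the one in the text, and uses a hypothetical `canAllocate()` helper that is not part of the standard library. A request that fits succeeds, while a request bigger than the buffer fails with an error instead of using memory outside of the buffer.

```zig
const std = @import("std");

// Hypothetical helper: reports whether `n` bytes can be allocated
// from the given allocator, and gives the memory back right away.
fn canAllocate(allocator: std.mem.Allocator, n: usize) bool {
    const bytes = allocator.alloc(u8, n) catch return false;
    allocator.free(bytes);
    return true;
}

pub fn main() void {
    var buffer: [10]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    // 5 bytes fit inside the 10-byte buffer.
    std.debug.print("5 bytes: {}\n", .{canAllocate(allocator, 5)});
    // 20 bytes do not fit, so `alloc()` returns an error
    // instead of handing out memory outside of `buffer`.
    std.debug.print("20 bytes: {}\n", .{canAllocate(allocator, 20)});
}
```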

Remember, the memory allocated by these allocator objects can come either from
the stack or from the heap. It all depends on where the buffer object that you provide lives.
In the above example, the `buffer` object lives on the stack and, therefore, the memory allocated
is based on the stack. But what if it was based on the heap?

As we described in @sec-stack-overflow, one of the main reasons why you would use the heap,
instead of the stack, is to allocate huge amounts of space to store very big objects.
Thus, let's suppose you wanted to use a very big buffer object as the basis for your
allocator objects. You would have to allocate this very big buffer object on the heap.
The example below demonstrates this case.

```{zig}
#| eval: false
#| build_type: "run"
#| auto_main: true
const heap = std.heap.page_allocator;
const memory_buffer = try heap.alloc(
    u8, 100 * 1024 * 1024 // 100 MB memory
);
defer heap.free(memory_buffer);
var fba = std.heap.FixedBufferAllocator.init(
    memory_buffer
);
const allocator = fba.allocator();
const input = try allocator.alloc(u8, 1000);
defer allocator.free(input);
```



### Arena allocator {#sec-arena-allocator}

4 changes: 2 additions & 2 deletions Chapters/01-zig-weird.qmd
@@ -1316,7 +1316,7 @@ that are very useful to use when working with strings. Most notably:
- `std.mem.splitScalar()`: to split a string into an array of substrings given a delimiter value.
- `std.mem.splitSequence()`: to split a string into an array of substrings given a substring delimiter.
- `std.mem.startsWith()`: to check if string starts with substring.
- `std.mem.endsWith()`: to check if string starts with substring.
- `std.mem.endsWith()`: to check if string ends with substring.
- `std.mem.trim()`: to remove specific values from both start and end of the string.
- `std.mem.concat()`: to concatenate strings together.
- `std.mem.count()`: to count the occurrences of substring in the string.
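
As a small illustration of the difference between `std.mem.startsWith()` and `std.mem.endsWith()`, consider the sketch below. The `isCsvReport()` helper and the file names are hypothetical, chosen only for this example.

```zig
const std = @import("std");

// Hypothetical helper, used only for this illustration.
fn isCsvReport(file_name: []const u8) bool {
    // startsWith() looks at the beginning of the string,
    // endsWith() looks at the end of it.
    return std.mem.startsWith(u8, file_name, "report") and
        std.mem.endsWith(u8, file_name, ".csv");
}

pub fn main() void {
    std.debug.print("{}\n", .{isCsvReport("report-2024.csv")});
    std.debug.print("{}\n", .{isCsvReport("notes.csv")});
    std.debug.print("{d}\n", .{std.mem.count(u8, "banana", "an")});
}
```
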
@@ -1465,7 +1465,7 @@ In other words, the `zig` compiler does not obligates you to use such tools.
The tools listed below are related to memory safety. That is, they help you to achieve
memory safety in your Zig code:

- `defer` allows you to keep free operations phisically close to allocations. This helps you to avoid memory leaks, "use after free", and also "double-free" problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.
- `defer` allows you to keep free operations physically close to allocations. This helps you to avoid memory leaks, "use after free", and also "double-free" problems. Furthermore, it also keeps free operations logically tied to the end of the current scope, which greatly reduces the mental overhead about object lifetime.
- `errdefer` helps you to guarantee that your program frees the allocated memory, even if a runtime error occurs.
- pointers and objects are non-nullable by default. This helps you to avoid memory problems that might arise from de-referencing null pointers.
- Zig offers some native types of allocators (called "testing allocators") that can detect memory leaks and double-frees. These types of allocators are widely used on unit tests, so they transform your unit tests into a weapon that you can use to detect memory problems in your code.
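
A minimal sketch of how some of these tools fit together is shown below. The `makeBuffer()` function and its `error.TooBig` error are hypothetical names introduced for this illustration. The `errdefer` statement frees the memory only when the function fails, and `std.testing.allocator` makes the unit test fail if any allocation is leaked.

```zig
const std = @import("std");

// Hypothetical helper: builds a buffer of `n` bytes, and fails for n > 8.
fn makeBuffer(allocator: std.mem.Allocator, n: usize) ![]u8 {
    const buf = try allocator.alloc(u8, n);
    // Runs only if the function returns an error after this point.
    errdefer allocator.free(buf);
    if (n > 8) return error.TooBig;
    @memset(buf, 0);
    return buf;
}

test "no leaks on success or on failure" {
    const allocator = std.testing.allocator;
    // Success path: the caller frees the buffer with `defer`.
    const buf = try makeBuffer(allocator, 4);
    defer allocator.free(buf);
    // Failure path: `errdefer` inside makeBuffer() frees the memory.
    try std.testing.expectError(error.TooBig, makeBuffer(allocator, 16));
}
```
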
43 changes: 28 additions & 15 deletions Chapters/03-structs.qmd
@@ -606,7 +606,7 @@ But I want to emphasize a curious fact about function parameters (a.k.a. functio
In summary, function parameters are immutable in Zig.

Take the code example below, where we declare a simple function that just tries to add
some amount to the input integer, and returns the result back. But if you look closely
some amount to the input integer, and returns the result back. If you look closely
at the body of this `add2()` function, you will notice that we try
to save the result back into the `x` function argument.

@@ -616,8 +616,8 @@ into `x`. However, function arguments in Zig are immutable. You cannot change th
cannot assign values to them inside the function's body.

This is the reason why the code example below does not compile successfully. If you try to compile
this code example, you get a compile error warning you that you are trying to change the value of a
immutable (i.e. constant) object.
this code example, you will get a compile error message about "trying to change the value of an
immutable (i.e. constant) object".

```{zig}
#| eval: false
@@ -640,6 +640,8 @@ t.zig:3:5: error: cannot assign to constant
```


### A free optimization

If a function argument receives as input an object whose data type is
any of the primitive types that we have listed in @sec-primitive-data-types,
this object is always passed by value to the function. In other words, this object
@@ -653,16 +655,20 @@ choose the strategy that is faster for you.
This optimization that you get for free is possible only because function arguments are
immutable in Zig.


### How to overcome this barrier

There are some situations where you might need to change the value of your function argument
directly inside the function's body. This happens more often when we are passing
C structs as inputs to Zig functions.

In a situation like this, you can overcome this barrier of immutable function arguments, by simply taking the lead,
and explicitly choosing to pass the object by reference to the function.
That is, instead of depending on the `zig` compiler to decide which strategy is best, you have
to explicitly mark the function argument as a pointer. This way, we are telling the compiler
that this function argument will be passed by reference to the function.
In a situation like this, you can overcome this barrier by using a pointer. In other words,
instead of passing a value as input to the argument, you can pass a "pointer to value" instead.
You can change the value that the pointer points to, by dereferencing it.

Therefore, if we take our previous `add2()` example, we can change the value of the
function argument `x` inside the function's body by marking the `x` argument as a
"pointer to a `u32` value" (i.e. `*u32` data type), instead of a `u32` value.
By making it a pointer, we can finally alter the value of this function argument directly inside
the body of the `add2()` function. You can see that the code example below compiles successfully.

@@ -687,6 +693,13 @@ Result: 6
```


Even in the code example above, the `x` argument is still immutable, which means that the pointer itself is immutable.
Therefore, you cannot change the memory address that it points to. However, you can dereference the pointer
to access the value that it points to, and also, to change this value, if you need to.
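
The sketch below illustrates this distinction. It is a reconstruction written for this explanation, assuming an `add2()` function that receives a `*u32` and an input value of 4. It is not necessarily identical to the listing in the chapter.

```zig
const std = @import("std");

fn add2(x: *u32) void {
    // `x` itself is a constant: reassigning it (e.g. `x = other_pointer;`)
    // would be rejected by the compiler. But the value that `x` points to
    // can be changed by dereferencing the pointer with `.*`.
    x.* = x.* + 2;
}

pub fn main() void {
    var value: u32 = 4;
    add2(&value);
    std.debug.print("Result: {d}\n", .{value});
}
```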





## Structs and OOP {#sec-structs-and-oop}

@@ -988,7 +1001,7 @@ const Vec3 = struct {
return m.sqrt(xd + yd + zd);
}
pub fn double(self: *Vec3) void {
pub fn twice(self: *Vec3) void {
self.x = self.x * 2.0;
self.y = self.y * 2.0;
self.z = self.z * 2.0;
@@ -997,9 +1010,9 @@ const Vec3 = struct {
```

Notice in the code example above that we have added a new method
to our `Vec3` struct named `double()`. This method doubles the
to our `Vec3` struct named `twice()`. This method doubles the
coordinate values of our vector object. In the
case of the `double()` method, we annotated the `self` argument as `*Vec3`,
case of the `twice()` method, we annotated the `self` argument as `*Vec3`,
indicating that this argument receives a pointer (or a reference, if you prefer to call it this way)
to a `Vec3` object as input.

@@ -1008,7 +1021,7 @@ to a `Vec3` object as input.
var v3 = Vec3 {
.x = 4.2, .y = 2.4, .z = 0.9
};
v3.double();
v3.twice();
std.debug.print("Doubled: {d}\n", .{v3.x});
```

@@ -1018,15 +1031,15 @@ Doubled: 8.4



Now, if you change the `self` argument in this `double()` method to `self: Vec3`, like in the
Now, if you change the `self` argument in this `twice()` method to `self: Vec3`, like in the
`distance()` method, you will get the compiler error shown below as a result. Notice that this
error message is showing a line from the `double()` method body,
error message is showing a line from the `twice()` method body,
indicating that you cannot alter the value of the `x` data member.

```{zig}
#| eval: false
// If we change the function signature of double to:
pub fn double(self: Vec3) void {
pub fn twice(self: Vec3) void {
```

```
14 changes: 7 additions & 7 deletions Chapters/10-stack-project.qmd
@@ -68,22 +68,22 @@ function argument that is not known at compile time, the `zig` compiler will not
as a consequence, it will raise a compilation error saying that it cannot compile your program. Because
you are providing a value that is "runtime known" to a function argument that must be "compile-time known".

Take a look at this very simple example below, where we define a `double()` function, that simply
Take a look at this very simple example below, where we define a `twice()` function, that simply
doubles the input value named `num`. Notice that we use the `comptime` keyword before the name
of the function argument. This keyword is marking the function argument `num` as a "comptime argument".

That is a function argument whose value must be compile-time known. This is why the expression
`double(5678)` is valid, and no compilation errors are raised. Because the value `5678`
`twice(5678)` is valid, and no compilation errors are raised. Because the value `5678`
is compile-time known, so this is the expected behaviour for this function.

```{zig}
#| auto_main: false
#| build_type: "test"
fn double(comptime num: u32) u32 {
fn twice(comptime num: u32) u32 {
return num * 2;
}
test "test comptime" {
_ = double(5678);
_ = twice(5678);
}
```

@@ -92,14 +92,14 @@ For example, we might provide a different input value to this function depending
on the target OS of our compilation process. The code example below demonstrates such a case.

Because the value of the object `n` is determined at runtime, we cannot provide this object
as input to the `double()` function. The `zig` compiler will not allow it, because we marked
as input to the `twice()` function. The `zig` compiler will not allow it, because we marked
the `num` argument as a "comptime argument". That is why the `zig` compiler raises
the compile-time error shown below:

```{zig}
#| eval: false
const builtin = @import("builtin");
fn double(comptime num: u32) u32 {
fn twice(comptime num: u32) u32 {
return num * 2;
}
test "test comptime" {
@@ -109,7 +109,7 @@ test "test comptime" {
} else {
n = 5678;
}
_ = double(n);
_ = twice(n);
}
```
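
For contrast, here is a sketch of one way to keep the OS-dependent choice and still satisfy the `comptime` requirement. It assumes that the choice is made on `builtin.target.os.tag`, and the value `1234` is a hypothetical placeholder for the Windows branch. Since the target OS is known at compile time, a `const` initialized from an `if` expression over it is also compile-time known.

```zig
const std = @import("std");
const builtin = @import("builtin");

fn twice(comptime num: u32) u32 {
    return num * 2;
}

test "test comptime with a compile-time known choice" {
    // `builtin.target.os.tag` is known at compile time, so this `if`
    // expression is too, and `n` can be given to a comptime argument.
    const n: u32 = if (builtin.target.os.tag == .windows) 1234 else 5678;
    const result = twice(n);
    try std.testing.expect(result == 2468 or result == 11356);
}
```

The difference from the failing example is that `n` is a constant whose value the compiler can compute, instead of a `var` that is assigned inside a branch at runtime.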

8 changes: 5 additions & 3 deletions _freeze/Chapters/01-memory/execute-results/html.json

Large diffs are not rendered by default.

8 changes: 5 additions & 3 deletions _freeze/Chapters/01-zig-weird/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/Chapters/03-structs/execute-results/html.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions _freeze/Chapters/10-stack-project/execute-results/html.json

Large diffs are not rendered by default.


0 comments on commit 491ad4a
