diff --git a/content/blog/2024-08-22-gsoc-multiboot2.mdx b/content/blog/2024-08-22-gsoc-multiboot2.mdx new file mode 100644 index 00000000..a49bda17 --- /dev/null +++ b/content/blog/2024-08-22-gsoc-multiboot2.mdx @@ -0,0 +1,111 @@ +--- +title: "Multiboot2 Support in Unikraft" +description: | + In this blog post, I discuss the final weeks of my GSoC project, highlighting the process of testing Nuktiboot2 support on the helloworld and nginx applications, and reflecting on the progress made in enhancing the boot process. +publishedDate: 2024-08-22 +image: /images/unikraft-gsoc24.png +tags: +- gsoc +- gsoc24 +- multiboot2 +- booting +authors: +- Maria Pana +--- + +The previous blog post concluded with testing the Multiboot2 support on the `helloworld` application. +While the initial exploration was promising, a roadblock was soon encountered. +This time, I will delve deeper into the challenges I faced and the solutions implemented to overcome them. + +## The `helloworld` Odyssey Comes to an End + +The `helloworld` application was failing due to an incorrect memory region allocation for the command line (`cmdline`) argument. +The issue seems to have had its roots in the way GRUB was placing the command line arguments between the code segmnent and the read-only data segment. +This placement was inconsistent with the expected memory layout: + +```log +000000100000-00000012f000 000000100000-00000012f000 rw- 0000000000100000 +00000012f000-000000130000 00000012f2d8-00000012f2f8 r-- 000000000012f000 cmdl <-- +000000100000-000000130000 000000100000-000000130000 r-x 0000000000100000 krnl +000000130000-000000142000 000000130000-000000142000 r-- 0000000000130000 krnl +``` + +Fortunately, my mentor, Sergiu Moga, came to the rescue. +According to him, a currently open PR proposes that such cases will be handled by having two separate commandline buffers: one mutable and one immutable. +The first is used by Unikraft to parse and modify arguments, while the second contains what GRUB provides. +Therefore, we now coalesce all memory regions except the `cmdl`, which is afterwards allocated separately and populated with the original command line arguments. +This way, the memory region for the commandline is placed at the start of the boot memory map, without interfering with the rest of the memory regions: + +```log +000000001000-000000002000 000000001000-000000001020 r-- 0000000000001000 cmdl <-- +000000002000-000000003000 000000002000-00000000200b rw- 0000000000002000 krnl +000000003000-000000013000 000000003000-000000013000 rw- 0000000000003000 stck +000000013000-0000000a0000 000000013000-0000000a0000 rw- 0000000000013000 +0000000a0000-0000000c0000 0000000a0000-0000000c0000 rw- 00000000000a0000 rsvd +0000000e0000-000000100000 0000000e0000-000000100000 r-- 00000000000e0000 rsvd +000000100000-000000130000 000000100000-000000130000 r-x 0000000000100000 krnl +000000130000-000000142000 000000130000-000000142000 r-- 0000000000130000 krnl +000000142000-000000143000 000000142000-000000143000 rw- 0000000000142000 krnl +000000143000-00000017f000 000000143000-00000017f000 rw- 0000000000143000 krnl +00000017f000-0000bffe0000 00000017f000-0000bffe0000 rw- 000000000017f000 +0000bffe0000-0000c0000000 0000bffe0000-0000c0000000 r-- 00000000bffe0000 rsvd +0000feffc000-0000ff000000 0000feffc000-0000ff000000 r-- 00000000feffc000 rsvd +0000fffc0000-000100000000 0000fffc0000-000100000000 r-- 00000000fffc0000 rsvd +``` + +With this change, the `helloworld` application now works as expected🎉! +I probably was never as happy to see a simple "Hello, World!" +message printed on the screen as I was then. + +> see a **SIMPLE** "Hello, World!" message printed on the screen + +Simple, huh? +Yeah, only the `helloworld` application working is not going to cut it. + +While `helloworld` provided a valuable initial test, the app doesn't actually use all the implemented Multiboot2 tags (i.e. the handling of the `MULTIBOOT2_TAG_TYPE_MODULE` tag had yet to be tested). +Therefore, the next step was to "behead" a more complex application. + +If you are seeing this, you are missing out on a VERY funny meme about testing Multiboot2 + +## nginx: The Next Frontier + +To see if the `initrd` module is correctly handled through the `MULTIBOOT2_TAG_TYPE_MODULE` tag, I got to testing on `nginx`. +Getting the `nginx` testing setup to be compatible with Multiboot2 was a bit of a challenge since QEMU doesn't natively support Multiboot2. +Consequently, options such as `-kernel` and `-append` used in the usual running script are not applicable. +Instead, I tested on the `.iso` file generated by [`mkukimg`](https://github.com/unikraft/unikraft/blob/staging/support/scripts/mkukimg) while passing the command line arguments through the script's `-c` option. +For a clearer understanding, you can check this sample [testing script](https://gist.github.com/mariapana/90ab2e61715e812def0c7582c9482626). + +With the QEMU issue out of the way, I ran into a perplexing error: `Truncated CPIO header`. +This seemed odd, as the `.cpio` archive for `initrd` was generated correctly. +Upon further investigation, it is once again I found myself digging into the boot memory map. +Computing the size for the memory region of the `initrd` module proved to the culprit, leaving too little space for the `initrd` module to be loaded. + +A simple adjustment to the memory region size calculation fixed the issue, and the `nginx` application was up and running with Multiboot2 in no time! + +## The Finishing Touches: Polishing code and Preparing for Review + +With both the `helloworld` and `nginx` applications working with Multiboot2, it was time to refine my implementation. + +The main focus was refactoring the [`multiboot2.py`](https://github.com/mariapana/unikraft/blob/multiboot2/support/scripts/multiboot2.py) script to simplify future tag additions and make it more maintainable. +Taking inspiration from [GRUB's approach](https://elixir.bootlin.com/grub/grub-2.06/source/grub-core/loader/multiboot_mbi2.c#L152), I essentially created a "carousel" of tags. +This way, every tag would have its dedicated space where the implementation would be added, eveything being performed through only one all mighty function. +At the moment, just two tags are implemented and they are both very similiar in structure, so this idea seemed harmless. +However, as the number of tags grows, this approach might become cumbersome, with the function becoming too large. +This is the reason why I opted for an enhanced version of the original plan, where each tag has its own function. + +The [`mkukimg`](https://github.com/unikraft/unikraft/blob/staging/support/scripts/mkukimg) script received an update to support both Multiboot and Multiboot2, introducing separate `grubmb` and `grubmb2` options for the bootloader. +Finally, coding style and commit history are being polished in preparation for a well-structured pull request submission. + +## ~~Next steps~~ The End? + +As the Google Summer of Code program is coming to an end, I can't help but feel a bit nostalgic. +The past three months have been a rollercoaster of emotions, from the frustration of hitting roadblocks to then the even greater satisfaction of surpassing them. +I am grateful for the opportunity to work on this project and for my mentors who guided me and supported me throughout this process. +Somehow they managed to keep my sanity intact even when I was ready to throw my computer out the window (metaphorically speaking, of course). + +That being said, the journey doesn't end here. +Even with Multiboot2 being functional now, there are still plenty of ways to go beyond. +So, (un)fortunately, this is not the last you'll hear from me. + +Until then, thank you for coming along the journey! +Toodles!👋