LMBench lat_mem_rd on qtrvsim #153

Open
arav12ind opened this issue Dec 8, 2024 · 3 comments

@arav12ind

I have been wanting to compile LMBench for qtrvsim and do some experiments with lat_mem_rd. From what I noticed, printf isn't working with qtrvsim. I am also not sure how runtime will get measured in qtrvsim or if it is even possible. Can someone clarify if it is possible to get lat_mem_rd working without having to write my own standard library functions for qtrvsim?

@ppisa
Member

ppisa commented Dec 9, 2024

It would help if you specified which compiler and libraries you are trying to use.

QtRvSim's list of supported system calls is very limited (ftruncate, openat, close, read, write, readv, writev, exit, brk, and a limited mmap suitable for anonymous memory allocation). The system calls are compatible with the Linux kernel's syscall numbers and calling convention. This is enough to compile simple binaries that run on QtRvSim as well as on a 32-bit or 64-bit native GNU/Linux system or under user-space QEMU emulation. However, you cannot use full-featured Glibc, because it requires far more operating system support.

However, when NewLib or one of its clones is used together with embedded GCC (i.e. the riscv64-unknown-elf tools, available on Debian as the gcc-riscv64-unknown-elf package), you can compile binaries compatible with both the Linux kernel and QtRvSim. See the example, which uses printf() and malloc():

https://gitlab.fel.cvut.cz/b35apo/stud-support/-/tree/master/seminaries/qtrvsim/os-emu-example

There, malloc-test.c is a standard Unix-model single-threaded C program starting from main().

The mapping between NewLib functions and Linux kernel system calls is provided in the file qtrvsim_sys_stub.c. The environment also needs a C runtime startup file; a minimal one is provided in crt0local.S. The critical part is to set up the gp register correctly, since it is required to reach global variables linked in the .data and .bss sections. Then the area for the stack and heap is allocated: it is the space between the _heap_stack_start and _heap_stack_end symbols. It is only 16 kB in the example but can be much larger.
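To illustrate what such a mapping can look like, here is a minimal sketch of a write stub that issues the Linux system call directly. It is not the contents of qtrvsim_sys_stub.c; the names `raw_syscall3`, `stub_write` and `NR_WRITE` are hypothetical, and the host fallback branch exists only so the sketch can be built off-target.

```c
#include <stddef.h>

#if defined(__riscv)
/* RISC-V Linux convention: syscall number in a7, arguments in a0..a2,
 * result (or a negative errno value) returned in a0. */
static long raw_syscall3(long nr, long arg0, long arg1, long arg2)
{
    register long a0 __asm__("a0") = arg0;
    register long a1 __asm__("a1") = arg1;
    register long a2 __asm__("a2") = arg2;
    register long a7 __asm__("a7") = nr;
    __asm__ volatile("ecall"
                     : "+r"(a0)
                     : "r"(a1), "r"(a2), "r"(a7)
                     : "memory");
    return a0;
}

#define NR_WRITE 64 /* write() number in the generic Linux syscall table */

/* Hypothetical stub name; a NewLib port would typically hook this up
 * to its own low-level write entry point. */
long stub_write(int fd, const void *buf, size_t len)
{
    return raw_syscall3(NR_WRITE, fd, (long)buf, (long)len);
}
#else
/* Host fallback (assumption: a POSIX host), only for trying the sketch. */
#include <unistd.h>

long stub_write(int fd, const void *buf, size_t len)
{
    return (long)write(fd, buf, len);
}
#endif
```

Calling `stub_write(1, "ok\n", 3)` should send three bytes to standard output, which QtRvSim maps to its terminal by default.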

By default, files accessed through open() and close() are mapped to the terminal, the same as file descriptors 0, 1 and 2. But when you specify a directory representing the filesystem root for the QtRvSim Linux kernel "emulation", files under this directory can be created, opened, read and written, as well as truncated to allow rewriting.
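As a rough illustration of the create, truncate and rewrite sequence described above, the following sketch uses only plain POSIX calls. The helper name `save_text` is hypothetical, and it assumes the C library stubs in use expose open(), ftruncate(), write() and close() on top of the supported system calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Create or rewrite `path` with `text`.
 * Returns the number of bytes written, or -1 on failure. */
long save_text(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Drop old contents so a shorter rewrite leaves no stale tail. */
    if (ftruncate(fd, 0) != 0) {
        close(fd);
        return -1;
    }

    long written = (long)write(fd, text, strlen(text));
    close(fd);
    return written;
}
```

The same source should also build and run unchanged on a native GNU/Linux system, which is the point of keeping to the Linux system call definitions.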

So simple POSIX-based programs can be compiled and run on QtRvSim, and they are real Linux ELF binaries as well.

An alternative option is to rewrite qtrvsim_sys_stub.c to implement read(), write(), readv() and writev() as operations that access the serial port directly through its registers. NewLib can then again be used to implement printf(), scanf() and the rest of the stream I/O operations on top of this base.
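A write routine of that kind might look like the sketch below. The register layout (a transmit status word at index 2 and a transmit data word at index 3 from the port base, with bit 0 meaning "ready") is an assumption and should be checked against the QtRvSim peripheral documentation; `serial_write` and the constant names are hypothetical. The port base is passed as a parameter so the routine can also be exercised against an ordinary array.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, as 32-bit word indices from the serial port base.
 * Verify these against the QtRvSim documentation before use. */
enum { SERP_TX_ST_IDX = 2, SERP_TX_DATA_IDX = 3 };
#define SERP_TX_READY 0x1u

/* Send `len` bytes from `buf` by polling the transmit status register.
 * Returns the number of bytes sent, like write(). */
long serial_write(volatile uint32_t *port, const void *buf, size_t len)
{
    const unsigned char *bytes = buf;

    for (size_t i = 0; i < len; i++) {
        /* Busy-wait until the transmitter accepts another character. */
        while ((port[SERP_TX_ST_IDX] & SERP_TX_READY) == 0) {
        }
        port[SERP_TX_DATA_IDX] = bytes[i];
    }
    return (long)len;
}
```

On the target, `port` would point at the memory-mapped serial port base address; a stub for write() on descriptors 1 and 2 could then forward to this routine.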

The system call support can be extended; pull requests with portable implementations are welcome. But they should map to the respective Linux kernel definitions. A functional subset is fine. I do not want to create another new incompatible, proprietary interface as RARS and other simulators often define. Code which is limited to user-privilege instructions and does not access hardware directly should run on real RISC-V Linux systems as well.

@ppisa
Member

ppisa commented Dec 9, 2024

As for time measurement, the clock_gettime() syscall would be worth implementing. In the current QtRvSim version, you can get time-related information by reading the cycle or mcycle registers or the standard MTIMER peripheral. But there is a problem if you want to measure the latency of memory accesses that miss in the cache: the cycle counter does not include that delay. We do not want the GUI to have to wait multiple cycles to process a single instruction, so the memory access delay is accounted for only in the respective cache statistics and does not cause pipeline stalls or additional cycles.

You can, however, see the statistics and the cache speedup in the cache dialog. For more automated testing you can use qtrvsim_cli, the command-line version, which allows you to enable and configure the cache and then grab the statistics at exit, on the exit system call or an ebreak instruction.

See the exercises https://cw.fel.cvut.cz/wiki/courses/b35apo/en/tutorials/04/start , https://cw.fel.cvut.cz/wiki/courses/b35apo/en/tutorials/05/start and the bonus task Optimization of Code and Cache Organization https://cw.fel.cvut.cz/wiki/courses/b35apo/en/homeworks/bonus/start . The template for the bonus is available at https://gitlab.fel.cvut.cz/b35apo/stud-support/-/tree/master/seminaries/qtrvsim/apo-sort and an online evaluation/competition can even be run at https://eval.comparch.edu.cvut.cz/task/4 .
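Since the delay shows up only in the cache statistics, a lat_mem_rd-style experiment could be reduced to the access pattern itself and the hit/miss counts read from the simulator. The following is a simplified sketch of such a pattern, a ring of dependent loads with a fixed stride. It is not LMBench code (LMBench chases real pointers, this uses array indices), and `build_ring` and `chase` are hypothetical names.

```c
#include <stddef.h>

/* Link each slot to the one `stride` elements ahead, wrapping around,
 * so that following the links walks the array with a fixed stride. */
void build_ring(size_t *ring, size_t count, size_t stride)
{
    for (size_t i = 0; i < count; i++)
        ring[i] = (i + stride) % count;
}

/* Perform `steps` dependent loads: each load's address comes from the
 * previous load's result. The final index is returned so the compiler
 * cannot discard the loop. */
size_t chase(const size_t *ring, size_t start, size_t steps)
{
    size_t idx = start;

    while (steps--)
        idx = ring[idx];
    return idx;
}
```

Varying `count` and `stride` relative to the configured cache size and block size should change the hit rate reported in the cache statistics, which is the effect a classroom demonstration would want to show.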

So try to specify what your goal is, and I can suggest what can be done with the features QtRvSim currently offers, or where to start implementing the required support in the QtRvSim code base.

@arav12ind
Author

arav12ind commented Dec 13, 2024

I used SiFive's toolchain.
I was tasked by my teacher to look for a demonstration of a 'real' benchmark that showcases some aspects of the cache in a GUI simulator for students. I felt LMBench's lat_mem_rd is the best benchmark for it.
Thank you for your suggestions; I haven't had time to go through them yet. I will take a look this weekend and get back to you.
Sorry for the late reply.


2 participants