OSv Linux ABI Compatibility
OSv mostly implements Linux's ABI. This means that most unmodified executable code compiled for Linux can run on OSv. There are only a few areas where OSv is known to be imperfectly compatible with Linux, and this document lists them.
OSv supports only a single process. Therefore, fork(), vfork() and clone() are not supported (their use in an executable will cause a crash because of a missing symbol).
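Since there is only one process, work that a Linux program would hand to a forked child has to run inside that process. One common substitution, which this document does not itself prescribe, is a thread. The following is a minimal sketch assuming POSIX threads are available; `square_worker` and `run_in_thread` are hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unit of work that a Linux program might have run in a
 * forked child. It squares the integer smuggled through the pointer. */
static void *square_worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Run `worker` in a new thread and wait for it to finish, roughly the
 * role fork() + waitpid() would play on Linux.
 * Returns 0 on success (the worker's return value is stored in *result),
 * or a pthread error number on failure. */
int run_in_thread(void *(*worker)(void *), void *arg, void **result)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, arg);
    if (err != 0)
        return err;
    return pthread_join(tid, result);
}
```

Unlike a forked child, the thread shares the caller's address space, so any state it modifies is visible to the rest of the application.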
Moreover, OSv has no isolation between the single process and the kernel: it does not track which memory and which resources (threads, mutexes, etc.) belong to the process and which belong to the kernel. As a result, exec(), which would have to replace only the process while leaving the kernel in place, is not supported. This applies to all exec() variants: execl(), execlp(), execle(), execv(), execvp(), execvpe(), and execve(). Instead, to run another executable, you can load it as a new shared object (with dlopen() or equivalent) and run its main() (see also osv::run(3o)), then attempt to free the old shared object's resources and unload it.
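The dlopen() route can be sketched as follows. This is an illustration under stated assumptions, not OSv's own implementation: it assumes the target was built as a shared object that exports a `main` symbol, and `run_shared_object` is a hypothetical helper name. It uses only the standard dlopen(), dlsym(), dlerror() and dlclose() calls, not osv::run.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef int (*main_fn)(int, char **);

/* Load the shared object at `path`, call its main(), then unload it.
 * Returns main()'s return value, or -1 if loading or symbol lookup
 * fails (so a main() that itself returns -1 is indistinguishable). */
int run_shared_object(const char *path, int argc, char **argv)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* POSIX requires this object-to-function pointer conversion to
     * work for dlsym() results, even though ISO C leaves it undefined. */
    main_fn entry = (main_fn)dlsym(handle, "main");
    int rc = -1;
    if (entry != NULL)
        rc = entry(argc, argv);
    else
        fprintf(stderr, "dlsym failed: %s\n", dlerror());

    /* Unloading does not reclaim threads, memory or other resources
     * that the loaded code left behind. */
    dlclose(handle);
    return rc;
}
```

A caller would pass the path of the application's shared object together with the argument vector it would otherwise have given to exec().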
OSv is designed to support a single application on a VM, and with only a single application, trying to isolate different users is pointless. Therefore, OSv supports only a single user, with uid=0 and gid=0. Trying to set a different user will fail. Permission bits on files are ignored (TODO: only the owner/group/other difference, or also the writable bit?)
In Linux, when a signal is sent to a process with kill(), it is delivered to one of this process's threads which hasn't masked this signal - preferably to the main thread but if it masked the signal, then one of the other threads is chosen.
In contrast, in OSv signals are delivered in a new thread, not in one of the process's existing threads. The signal handler does not preempt an existing thread and cannot take it over (as single-threaded applications sometimes did with longjmp()), and signal delivery cannot interrupt a long-running system call (such as sleep() or read()) in a thread.
It is hoped that this difference will not matter to most actual uses of kill() (and related functions, such as alarm()) in cloud applications. The reason signal handling was implemented differently in OSv is system call reentrancy and interruption. If a signal handler were to run in an existing thread, we would need to handle the case where the handler runs in the middle of a system call and itself calls a system call, which might not be reentrant. It is possible to track calls to "system calls" (calls from a shared object to the main program) and avoid running a signal handler while one is in progress (running it when the call returns), but then we have the problem of system call interruption: if the thread is sleeping in a "system call" like sleep(), poll() or read(), or internally in an OSv function like mutex_wait(), condvar_wait() or msleep(), the signal handler can be delayed indefinitely. This is why in Unix the signal first interrupts the system call, which must unravel its stack and return EINTR. All of this is doable, but it would require extensive changes to the code and make it slower and uglier, just to support an archaic Unix API, kill().
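One way to write code that behaves the same under both delivery models is to have the handler only set a flag and to have the waiting thread poll that flag in short sleeps, rather than relying on EINTR to cut a long sleep short. The following is a minimal sketch, assuming C11 atomics and POSIX sigaction() are available; the names `stop_requested`, `install_stop_handler` and `wait_for_stop` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Set by the handler. An atomic is used because the handler may run
 * in a different thread from the code that reads the flag. */
static atomic_bool stop_requested = false;

static void on_sigterm(int signo)
{
    (void)signo;
    atomic_store(&stop_requested, true);
}

/* Install the SIGTERM handler. Returns 0 on success, -1 on failure. */
int install_stop_handler(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Wait up to roughly max_ms milliseconds in 10 ms slices.
 * Returns true as soon as a stop request is seen, false on timeout.
 * It does not depend on the signal interrupting nanosleep(). */
bool wait_for_stop(int max_ms)
{
    const struct timespec slice = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    for (int waited = 0; waited < max_ms; waited += 10) {
        if (atomic_load(&stop_requested))
            return true;
        nanosleep(&slice, NULL);
    }
    return atomic_load(&stop_requested);
}
```

The cost of this pattern is a bounded reaction delay (up to one slice) in exchange for not depending on which thread runs the handler.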
Linux /proc/* files are not yet supported on OSv. Neither are /dev/random and /dev/urandom.
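Applications that read these files directly may want to detect their absence explicitly rather than assume they exist. The following is a minimal sketch of such a check for a random device; `read_random_bytes` is a hypothetical helper, and what the application should do when it fails is left to the caller.

```c
#include <stddef.h>
#include <stdio.h>

/* Fill `buf` with `len` bytes read from the device at `path`
 * (for example "/dev/urandom").
 * Returns 0 on success, or -1 if the device cannot be opened or the
 * read comes up short. */
int read_random_bytes(const char *path, unsigned char *buf, size_t len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

Calling this once at startup lets the application report a clear error instead of failing later with uninitialized key material.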