Description
Background & Motivation
Some context:
- `start`: Avoid string concatenation in inline asm. #21056
- add error for stack allocation in naked functions #72
- Return value of `inline` functions called from `.Naked` functions cause stack allocations #21193
- Call functions in naked function #18183
As can be seen from these issues, we're having to come up with a growing list of language rules to make naked functions work reliably, and I worry that we're going to keep running into edge cases in both the language and compiler implementation that will need special handling for naked functions. These rules add extra complexity throughout Sema, and they're really just trying to contend with a rather simple reality: All compilers that implement GCC-style inline assembly and naked functions, including LLVM, consider `asm` statements to be the only well-defined contents of naked functions. This is a reasonable stance for a compiler to take because, in the absence of an ABI-compliant prologue and epilogue, there are very few constructs that can be lowered correctly. It also makes a lot more sense if you think of naked functions as just being a convenient language feature for emitting a black box of machine instructions in function form, which is how GCC/Clang define them.
Rather than this growing list of language rules and the implementation complexity that comes with it, I'd like to suggest what I think will be a simpler way of specifying and implementing naked functions. This proposal will make Zig's naked functions match the GCC/Clang definition and therefore also conform to LLVM's requirements.
(Note that some elements of this proposal are not unique to it; for example, the `asm` restrictions that I describe below will have to be adopted in some form even if this particular proposal is rejected.)
Proposal
A function definition annotated with `callconv(.Naked)` is known as a naked function. A naked function must have an empty parameter list and must have `noreturn` as its return type. A naked function cannot be annotated with `inline`, `noinline`, or `extern`. A naked function cannot be called directly, instead requiring a function type cast first, at which point a more detailed signature can be supplied. The compiler treats a naked function as a black box for the purposes of optimization and code generation; reachable calls to a naked function cannot be inlined, reordered, or elided.
The compiler will perform basic scaffolding for a naked function, such as defining the symbol in the resulting object file, but it will not emit any machine instructions in the function body - not even the usual prologue and epilogue. The programmer is expected to provide the implementation by way of inline assembly. By virtue of the `noreturn` return type, it is considered safety-unchecked undefined behavior for control to reach the end of a naked function. (Note: This is unchecked because the panic handler cannot be invoked from a naked function. That said, compilers are encouraged to (and do) insert a single faulting instruction at the end of a naked function, making debugging a bit easier.)
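As a rough sketch of how the rules above might look in practice, the following assumes an AArch64 target using the standard procedure call standard (arguments in `w0` and `w1`, result in `w0`). The names `add_impl` and `add` are made up for illustration, and the code follows the semantics proposed here rather than what any released compiler accepts.

```zig
// Hypothetical naked function under the proposed rules:
// empty parameter list, `noreturn` return type, body made of `asm` only.
fn add_impl() callconv(.Naked) noreturn {
    asm (
        \\ add w0, w0, w1
        \\ ret
    );
}

// A naked function cannot be called directly, so the caller first casts it
// to a pointer type that carries the real signature and calling convention.
pub fn add(a: i32, b: i32) i32 {
    const typed: *const fn (i32, i32) callconv(.C) i32 = @ptrCast(&add_impl);
    return typed(a, b);
}
```

The cast is where the programmer vouches for the signature; the compiler only sees an opaque call through `typed` and is not free to inline or elide it.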
A naked function has its body `comptime`-evaluated during semantic analysis; that is, its body is implicitly a `comptime { ... }` block. During this evaluation, `asm` expressions are recorded rather than executed, and have some additional restrictions (see below). Besides this, all the usual `comptime` rules apply. After `comptime` evaluation is done, the function's body is fully replaced with the machine instructions resulting from the concatenation and assembly of the recorded `asm` expressions, in lexical order. No further compiler transformations are performed on the function body. (Note: The semantics here are very similar to container-level `comptime` blocks and the way `asm` expressions are treated there.)
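To illustrate the recording model, here is a small sketch, again assuming AArch64. It relies on one reading of the paragraph above: an `asm` expression in a `comptime` branch that is not taken is never evaluated and therefore never recorded. The function name `publish_barrier` is hypothetical.

```zig
const builtin = @import("builtin");

// Hypothetical example of the proposed semantics; the whole body is
// evaluated at comptime and each `asm` expression is recorded, not executed.
fn publish_barrier() callconv(.Naked) noreturn {
    // An ordinary comptime branch: the barrier is recorded only when the
    // program is built with threading support.
    if (!builtin.single_threaded) {
        asm ("dmb ish");
    }
    // Recorded last, so it ends up last in the assembled function body.
    asm ("ret");
}
```

Under that reading, a multi-threaded build would assemble the body as `dmb ish` followed by `ret`, and a single-threaded build as `ret` alone, with nothing else added by the compiler.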
In a naked function, there are some additional restrictions for `asm` expressions:
- The `volatile` annotation is not permitted. (Note: `asm volatile` is a meaningless concept in naked functions because of their black box nature.)
- Inputs are allowed, but the operands must evaluate to `comptime`-known values.
  - The only permissible input constraints are those which do not require emitting extra machine instructions to pass the input operand into the assembly block. The list of these is inherently target-specific, but usually includes `X` and `s`.
  - A compiler backend may place further target-specific restrictions on the types of input operands.
    - Most targets will require an input operand to be at most pointer-sized.
    - Some targets cannot generally accept a floating-point input operand because it would require emitting extra machine instructions.
- Outputs and clobbers are not allowed. (Note: This is because naked functions are already assumed to possibly do anything. This rule also implies that `asm` expressions in naked functions have no meaningful result value.)
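The restrictions above could be sketched as follows. This assumes an AArch64 target and mirrors the general shape of a program entry point; `entry` and `main_trampoline` are hypothetical names, and the code is written against the proposed rules (no `volatile`, no outputs, no clobbers), so it is illustrative rather than something a current compiler is known to accept.

```zig
// Hypothetical target of the jump; it receives the original stack pointer.
fn main_trampoline(initial_sp: [*]usize) callconv(.C) noreturn {
    _ = initial_sp;
    while (true) {}
}

fn entry() callconv(.Naked) noreturn {
    asm (
        \\ mov fp, #0
        \\ mov lr, #0
        \\ mov x0, sp
        \\ and sp, x0, #-16
        \\ b %[target]
        :
        // The operand is a comptime-known function address, and the `X`
        // constraint lets it be substituted as a symbol reference without
        // any extra instructions to load it.
        : [target] "X" (&main_trampoline),
    );
}
```

A runtime value, or a constraint such as `r` that forces the operand into a register, would need instructions emitted outside the assembly text, which is what the input rules are meant to rule out.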
Open Questions
- If we're concerned about the implicit `comptime`-ness of the function body as a result of just the `callconv(.Naked)` syntax, to make the semantics explicit, we could require that naked functions always contain just a single `comptime` block.
- It's worth considering `noinline` as a required annotation on naked functions to make explicit the fact that the compiler is not allowed to inline naked functions.
- I go back and forth on `asm volatile` in naked functions. It's easy to argue that `volatile` should be required to make explicit the fact that `asm` expressions in a naked function can never be dropped by the compiler. On the other hand, forbidding `volatile` is consistent with container-level `asm`. I don't feel particularly strongly about either approach; I just think we should either require or forbid `volatile`, rather than status quo where we apply the `asm volatile` rules that are used in regular function context.