Math support in core #2505

Closed
@japaric

Background

Currently the core crate doesn't provide support for mathematical functions like sqrt or sin.
To do math in a #![no_std] program one has the following options:

  • Link to a C implementation of libm, i.e. libm.a. This is cumbersome as the programmer needs to
    obtain a compiled version of libm for their target, or compile libm themselves, which requires a
    C cross toolchain when the target system and the build system differ in architecture or OS.

  • Use a pure Rust implementation of libm, like the libm crate. On stable, either (a) the
    performance of such an implementation won't be on par with a C implementation, or (b) to achieve
    the same performance the user would require a C (cross) toolchain.

To elaborate on (a) and (b), consider the following contrived program that computes the square root
of a number:

#![no_std]

extern crate libm;

use core::ptr;

use libm::F32Ext;

#[no_mangle]
pub unsafe fn foo() {
    // volatile memory accesses to prevent the compiler from optimizing away everything
    let x: f32 = ptr::read_volatile(0x2000_0000 as *const _);
    let y = x.sqrt();
    ptr::write_volatile(0x2000_1000 as *mut _, y);
}

When compiled for the thumbv7em-none-eabihf target it produces the following machine code:

00000000 <foo>:
   0:   f04f 5000       mov.w   r0, #536870912  ; 0x20000000
   4:   ed90 0a00       vldr    s0, [r0]
   8:   ee10 1a10       vmov    r1, s0
   c:   f001 40ff       and.w   r0, r1, #2139095040     ; 0x7f800000
  10:   f1b0 4fff       cmp.w   r0, #2139095040 ; 0x7f800000
  14:   d108            bne.n   28 <foo+0x28>
  16:   ee00 0a00       vmla.f32        s0, s0, s0
  (..)
 2f4:   ed80 0a00       vstr    s0, [r0]
 2f8:   4770            bx      lr

This is extremely inefficient machine code because the target has a hardware FPU that supports
computing the square root in a single instruction. Ideally, the program should compile down to the
following machine code:

00000000 <foo>:
   0:   f04f 5000       mov.w   r0, #536870912  ; 0x20000000
   4:   ed90 0a00       vldr    s0, [r0]
   8:   f241 0000       movw    r0, #4096       ; 0x1000
   c:   f2c2 0000       movt    r0, #8192       ; 0x2000
  10:   eeb1 0ac0       vsqrt.f32       s0, s0
  14:   ed80 0a00       vstr    s0, [r0]
  18:   4770            bx      lr

If the target had access to the standard library the program would compile down to that machine code
because the implementation of f32.sqrt in std looks like this:

#![feature(core_intrinsics)]

use std::intrinsics;

impl f32 {
    fn sqrt(self) -> Self {
        intrinsics::sqrtf32(self)
    }
}

sqrtf32 is an unstable, thin wrapper around an LLVM intrinsic that compiles down to a hardware
square root instruction if the target architecture has one in its instruction set, or to a call to
the sqrtf routine if it doesn't (*). std uses 30+ such LLVM intrinsics to speed up its math
functions.

(*) The llvm.sqrt.* LLVM intrinsic, which sqrtf32 wraps, is not quite specified like that but
that's the observable effect.
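
For comparison, on a hosted target where std is available, the stable inherent method gives access to this lowering without naming the intrinsic. A minimal sketch (hypotenuse is a hypothetical helper, not part of any crate discussed here):

```rust
/// Hypothetical helper: length of the hypotenuse of a right triangle.
/// On a target with std, `f32::sqrt` is an inherent method, so no
/// extension trait or feature gate is needed at the call site.
pub fn hypotenuse(a: f32, b: f32) -> f32 {
    // The `sqrt` call is what gets lowered to either a hardware
    // instruction or a call to `sqrtf`, depending on the target.
    (a * a + b * b).sqrt()
}
```

The same function body is what a #![no_std] crate cannot write today without pulling in an extension trait such as F32Ext.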

The libm crate can't make use of this intrinsic on stable because it's unstable and feature
gated. However, the libm crate could replicate the behavior of the sqrtf32 intrinsic using
conditional compilation and external assembly files as shown below:

// crate: libm

// NOTE heavily simplified because it ignores architectures other than ARM
impl F32Ext for f32 {
    #[cfg(target_arch = "arm")]
    fn sqrt(self) -> Self {
        extern "C" {
            // provided by an external assembly file
            fn vsqrt_f32(x: f32) -> f32;
        }

        unsafe { vsqrt_f32(self) }
    }

    #[cfg(not(target_arch = "arm"))]
    fn sqrt(self) -> Self {
        // software implementation (elided)
        unimplemented!()
    }
}
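
To give a sense of what the software implementation branch above involves, here is a minimal sketch based on Newton's method that uses only arithmetic available in core. This is an illustration, not the algorithm the libm crate uses; soft_sqrt is a hypothetical name, and the result is only approximate (it is not guaranteed to be correctly rounded, and subnormal inputs are not handled accurately).

```rust
/// Hypothetical software square root using Newton-Raphson iteration.
/// Uses only operations available in `core` (no intrinsics, no libm).
/// Assumption: inputs of interest are normal, finite, positive floats.
pub fn soft_sqrt(x: f32) -> f32 {
    // NaN and negative inputs have no real square root.
    if x.is_nan() || x < 0.0 {
        return f32::NAN;
    }
    // Zero (either sign) and +infinity map to themselves.
    if x == 0.0 || x == f32::INFINITY {
        return x;
    }
    // Initial guess: roughly halve the exponent by shifting the bit
    // pattern right and re-adding half of the exponent bias (127 << 22).
    let mut guess = f32::from_bits((x.to_bits() >> 1) + 0x1fc0_0000);
    // A few Newton steps: g <- (g + x / g) / 2.
    for _ in 0..4 {
        guess = 0.5 * (guess + x / guess);
    }
    guess
}
```

Even this simplified version needs special-case branches, bit manipulation and several divisions, which is why the compiled output is so much longer than a single vsqrt.f32 instruction.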

But this would heavily complicate the implementation of the libm crate, which would likely
introduce bugs. Also, as it's not possible to use inline assembly (asm!) on stable, the vsqrt.f32
instruction would have to be invoked via FFI and an external assembly file. External assembly files
mean that the user would require a C (cross) toolchain to build the crate, negating the main benefit
of using a pure Rust implementation of libm.

Possible solutions

I see two options for improving the situation here:

a. We stabilize the family of sqrtf32 LLVM intrinsics. This way crates like libm can achieve the
performance of the std implementation on stable without requiring complex conditional
compilation and C toolchains. Or,

b. We move all the existing math support from std to core. For the user this means that e.g.
f32.sqrt will also work in #![no_std] programs.

Option (a) is kind of bad (maybe?) for alternative backends like cranelift as they would have to
support / implement these LLVM intrinsics to reach parity with the rustc+LLVM compiler.

Option (b) requires us (*) to provide an implementation of math functions (symbols) like sqrtf
for targets that do not link to libm by default. If we don't do this those targets will hit
"undefined reference to sqrtf" linker errors when using math methods like f32.sqrt.

(*) "us" as in: we must provide symbols like sqrtf in the compiler-builtins crate. Note that we
are already providing such symbols for the wasm32-unknown-unknown target, and we are
using the libm crate to do that.

If we go ahead with option (b) we must be careful not to provide the math symbols in
compiler-builtins for targets that currently use the system libm (e.g.
x86_64-unknown-linux-gnu). If we did provide them, all existing programs would start using the
libm crate implementation instead of the system libm implementation, because of how we invoke the
linker: libcompiler_builtins.rlib appears before -lm in the linker arguments. That may degrade
performance in cases where the system libm has architecture-optimized implementations of some
functions.

With option (b) I believe that #![no_std] programs that currently link to some C
implementation of libm for math support will end up using the libm crate implementation as a side
effect. I don't see a way to avoid this: even if we mark the math symbols in compiler-builtins as
weak, the way we invoke the linker will still cause the program to use the libm crate implementation.

Final thoughts

IMO, math support should be in the core crate as it doesn't depend on OS or I/O abstractions
like other std-only APIs do (e.g. std::fs, std::net). Also, std makes math like sqrt feel
built-in because the functionality is provided as inherent methods, so it feels weird that such
"built-in" functionality is not available in #![no_std].

Thoughts? Should we do (a) or (b)? Or is there some other solution? Or should we leave math out of core?

cc @SimonSapin (T-libs), @jethrogb @Ericson2314 (T-portability), @joshtriplett @korken89 (some stakeholders)

Labels: T-libs-api (Relevant to the library API team, which will review and decide on the RFC.)