Windows performance in Release profile seems crippled when building dev Cargo profile (RelWithDebInfo is faster) #649

Closed
@vlovich

Description

I'm not sure whether this is specific to my configuration, but I'm seeing it consistently in CI and on my local machine: the compiler flags for the Release variant are missing /O2 (and /DNDEBUG). I don't think this is caused by this project itself, since I suspect the bug lies in the cmake crate, but I wanted to flag it here. As a result, CPU inference is more than 2x slower on my AMD machine, with performance equivalent to a debug build of llama-cli. Switching to RelWithDebInfo speeds it up considerably. The relevant entries from the CMake cache:

//Flags used by the CXX compiler during RELEASE builds.
CMAKE_CXX_FLAGS_RELEASE:STRING= -nologo -MD -Brepro

//Flags used by the CXX compiler during RELWITHDEBINFO builds.
CMAKE_CXX_FLAGS_RELWITHDEBINFO:STRING=/MD /Zi /O2 /Ob1 /DNDEBUG

//Flags used by the C compiler during RELEASE builds.
CMAKE_C_FLAGS_RELEASE:STRING= -nologo -MD -Brepro

//Flags used by the C compiler during RELWITHDEBINFO builds.
CMAKE_C_FLAGS_RELWITHDEBINFO:STRING=/MD /Zi /O2 /Ob1 /DNDEBUG

It's also worth noting that stock CMake sets /Ob2 for Release as well, but I didn't observe any meaningful performance difference from that.
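
For anyone who wants to check their own build for the same symptom, here is a minimal sketch that scans the text of a CMakeCache.txt for a flags entry and reports whether it carries an optimization switch. The helper names `cache_value` and `has_optimization` are hypothetical, not part of the cmake crate or this project. It treats only /O1, /O2 and /Ox (or their `-` spellings) as "optimized", which is a simplifying assumption.

```rust
/// Look up a `NAME:TYPE=value` entry in the contents of a CMakeCache.txt.
/// Hypothetical helper for illustration; returns the trimmed value if found.
fn cache_value<'a>(cache: &'a str, name: &str) -> Option<&'a str> {
    cache.lines().find_map(|line| {
        let line = line.trim();
        // Skip `//` description lines and `#` header comments.
        if line.starts_with("//") || line.starts_with('#') {
            return None;
        }
        let (key, value) = line.split_once('=')?;
        let (key_name, _type) = key.split_once(':')?;
        (key_name == name).then(|| value.trim())
    })
}

/// True if the flag string contains /O1, /O2 or /Ox (MSVC accepts `-` as well).
/// Assumption: inlining switches such as /Ob1 do not count as optimization here.
fn has_optimization(flags: &str) -> bool {
    flags.split_whitespace().any(|flag| {
        flag.strip_prefix('/')
            .or_else(|| flag.strip_prefix('-'))
            .map_or(false, |bare| matches!(bare, "O1" | "O2" | "Ox"))
    })
}
```

Run against the entries quoted above, `has_optimization` is false for `CMAKE_CXX_FLAGS_RELEASE` and true for `CMAKE_CXX_FLAGS_RELWITHDEBINFO`, which matches the slowdown I'm seeing.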
