[Performance + VRAM] Re-implement 16-byte Vertex Format + use 4 Byte alignment #487

Draft: wants to merge 4 commits into base: 1.21

thr3343 (Contributor) commented Aug 16, 2024

The 16-byte vertex format for terrain was planned in earlier versions of VulkanMod, but was scrapped because it caused performance regressions on AMD's GCN architecture.

This patch works around the problem by doubling the vertex byte alignment from 2 bytes to 4, which fixes the regression and allows the 16-byte format to be used on GCN at full performance.

The 16-byte vertex format provides the following advantages over the current 20-byte format:

  • VRAM usage for terrain vertex data reduced by 20% (16 bytes per vertex instead of 20)
  • Reduced bandwidth usage when loading/updating chunks
  • FPS and general performance improvements on some hardware (e.g. Nvidia Turing or later)
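
The 20% figure follows directly from the stride change (20 bytes down to 16). A minimal sketch of that arithmetic is below; the class name, method names, and the vertex count are hypothetical and only serve as illustration, not as anything taken from VulkanMod.

```java
public final class VertexBudget {
    // Percentage of vertex-buffer bytes saved when the per-vertex stride shrinks.
    static double savedPercent(int oldStride, int newStride) {
        return 100.0 * (oldStride - newStride) / oldStride;
    }

    // Total bytes needed to store vertexCount vertices at the given stride.
    static long bufferBytes(long vertexCount, int stride) {
        return vertexCount * stride;
    }

    public static void main(String[] args) {
        long vertices = 1_000_000L; // hypothetical vertex count, for illustration only
        System.out.println(bufferBytes(vertices, 20)); // 20000000
        System.out.println(bufferBytes(vertices, 16)); // 16000000
        System.out.println(savedPercent(20, 16));      // 20.0
    }
}
```

The saving applies only to buffers that use this vertex format, so total VRAM usage would likely drop by less than 20%.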

thr3343 (Contributor, Author) commented Aug 28, 2024

This doesn't seem to improve performance on Nvidia RTX 2000+, although older tests with alternate versions of the 16-byte format did.

I suspect this is due to alignment again: perhaps Nvidia requires a smaller 2-byte alignment for full performance, unlike GCN, which requires 4 bytes.

// 2-byte aligned: at least 10% more FPS on Nvidia Turing+, but causes performance regressions on GCN
layout (location = 0) in ivec4 Position;
layout (location = 1) in vec4 Color;
layout (location = 2) in uvec2 UV0;

// 4-byte aligned: slower on Nvidia RTX 2000+ because it does not hit the fast FP16 path
layout (location = 0) in ivec4 Position;
layout (location = 1) in vec4 Color;
layout (location = 2) in uint UV0;
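
For readers who want to see how a 16-byte vertex could feed the 4-byte-aligned declarations above, here is a rough CPU-side sketch. The byte layout is an assumption and is not stated in this PR: four 16-bit position components (8 bytes), four 8-bit color components (4 bytes), and two 16-bit UV components packed into one 32-bit word (4 bytes). The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class PackedVertexSketch {
    static final int STRIDE = 16; // bytes per vertex under the assumed layout

    // Writes one vertex: 8 bytes of position, 4 bytes of color, 4 bytes of packed UV.
    static void putVertex(ByteBuffer buf, short x, short y, short z, short w,
                          int rgba, int u, int v) {
        buf.putShort(x).putShort(y).putShort(z).putShort(w); // offset 0, 4 x 16-bit
        buf.putInt(rgba);                                    // offset 8, 4 x 8-bit
        buf.putInt((u & 0xFFFF) | ((v & 0xFFFF) << 16));     // offset 12, one 32-bit word
    }

    // Recover the two 16-bit halves, as a shader reading a single uint would have to.
    static int unpackU(int packed) { return packed & 0xFFFF; }
    static int unpackV(int packed) { return packed >>> 16; }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(STRIDE).order(ByteOrder.LITTLE_ENDIAN);
        putVertex(buf, (short) 1, (short) 2, (short) 3, (short) 0, 0xFFFFFFFF, 300, 700);
        System.out.println(buf.position()); // 16
        int packed = buf.getInt(12);
        System.out.println(unpackU(packed) + "," + unpackV(packed)); // 300,700
    }
}
```

Under this assumed layout every attribute starts at an offset that is a multiple of 4 (0, 8, 12), and the UV pair is fetched as a single 32-bit value instead of two 16-bit values.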

This is problematic, as this PR should improve performance on Nvidia, not decrease it.

So I will mark this as a draft until the missing performance uplift on Nvidia is fixed.

thr3343 marked this pull request as draft August 28, 2024 18:47
Select 2 Byte alignment by default (including Nvidia), otherwise use 4 Bytes on AMD (GCN)
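
The selection rule in the commit message above could be sketched roughly as follows. This is not the code from the commit: the class, the method, and the `isGcn` flag are hypothetical, and how GCN hardware is actually detected is not described in this PR. The only external fact used is AMD's PCI vendor ID (0x1002), which in Vulkan would typically be read from `VkPhysicalDeviceProperties.vendorID`.

```java
public final class AlignmentSelectSketch {
    static final int VENDOR_AMD = 0x1002; // AMD's PCI vendor ID

    // Returns the vertex alignment in bytes: 2 by default (including Nvidia),
    // 4 only for AMD devices that the caller has identified as GCN.
    static int vertexAlignmentBytes(int vendorId, boolean isGcn) {
        return (vendorId == VENDOR_AMD && isGcn) ? 4 : 2;
    }

    public static void main(String[] args) {
        System.out.println(vertexAlignmentBytes(0x10DE, false)); // Nvidia: 2
        System.out.println(vertexAlignmentBytes(VENDOR_AMD, true)); // AMD GCN: 4
    }
}
```

Keeping 2 bytes as the default means only the hardware known to regress pays for the wider alignment.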
thr3343 changed the base branch from 1.20.x to 1.21 October 5, 2024 17:27