Add sample for dynamic rendering local read (#887)
* Updated Vulkan headers to 1.3.276

* Only destroy render pass if != null

Required for samples that don't use render passes

* Added support for loading colors from glTF files

Initialize base color
Removed assert that would trigger on some basic glTFs
Refs #848

* Added new sample for dynamic rendering local read

Copy from internal git repository

* Added dynamic rendering local read sample to readme and docs navigation

* Update copyright
Fix clang format

* Adjusted memory barrier

* Clean up api sample base class interfaces

* Revert "Clean up api sample base class interfaces"

This reverts commit 7504395.

* Adjust to recent framework changes

* Clang format

* Disable GUI for dynamic rendering
Would require fixes to the framework

* Trying to fix things

* Correct merge error

* Added missing break

* Fix documentation

* Minor cleanup

* Desperately trying to fix clang format

* Trying to fix CI

* Trying to fix CI

* Clang format
IMO not correct, but the only way to get that build step running

* Added read-only flag to lights buffer
SaschaWillems authored Sep 6, 2024
1 parent 46a89dd commit d0fda1b
Showing 18 changed files with 1,660 additions and 7 deletions.
1 change: 1 addition & 0 deletions antora/modules/ROOT/nav.adoc
@@ -60,6 +60,7 @@
** xref:samples/extensions/dynamic_line_rasterization/README.adoc[Dynamic line rasterization]
** xref:samples/extensions/dynamic_primitive_clipping/README.adoc[Dynamic primitive clipping]
** xref:samples/extensions/dynamic_rendering/README.adoc[Dynamic rendering]
** xref:samples/extensions/dynamic_rendering_local_read/README.adoc[Dynamic rendering local read]
** xref:samples/extensions/extended_dynamic_state2/README.adoc[Extended dynamic state2]
** xref:samples/extensions/fragment_shader_barycentric/README.adoc[Fragment shader barycentric]
** xref:samples/extensions/fragment_shading_rate/README.adoc[Fragment shading rate]
5 changes: 4 additions & 1 deletion framework/api_vulkan_sample.cpp
@@ -593,7 +593,10 @@ ApiVulkanSample::~ApiVulkanSample()
 vkDestroyDescriptorPool(get_device().get_handle(), descriptor_pool, nullptr);
-vkDestroyRenderPass(get_device().get_handle(), render_pass, nullptr);
+if (render_pass != VK_NULL_HANDLE)
+{
+    vkDestroyRenderPass(get_device().get_handle(), render_pass, nullptr);
+}
 for (uint32_t i = 0; i < framebuffers.size(); i++)
     vkDestroyFramebuffer(get_device().get_handle(), framebuffers[i], nullptr);
1 change: 1 addition & 0 deletions framework/api_vulkan_sample.h
@@ -71,6 +71,7 @@ struct Vertex
glm::vec2 uv;
glm::vec4 joint0;
glm::vec4 weight0;
glm::vec3 color;

38 changes: 35 additions & 3 deletions framework/gltf_loader.cpp
@@ -431,7 +431,6 @@ std::unique_ptr<sg::Scene> GLTFLoader::read_scene_from_file(const std::string &f
if (!err.empty())
LOGE("Error loading gltf model: {}.", err.c_str());

return nullptr;

@@ -1056,7 +1055,12 @@ sg::Scene GLTFLoader::load_scene(int scene_index, VkBufferUsageFlags additional_
 auto node_it = traverse_nodes.front();

-assert(node_it.second < nodes.size());
+// @todo: this crashes on some very basic scenes
+// assert(node_it.second < nodes.size());
+if (node_it.second >= nodes.size())
+{
+    continue;
+}
 auto &current_node = *nodes[node_it.second];
 auto &traverse_root_node = node_it.first;

@@ -1123,6 +1127,8 @@ std::unique_ptr<sg::SubMesh> GLTFLoader::load_model(uint32_t index, bool storage
const float *uvs = nullptr;
const uint16_t *joints = nullptr;
const float *weights = nullptr;
const float *colors = nullptr;
uint32_t color_component_count{4};

// Position attribute is required
auto &accessor = model.accessors[gltf_primitive.attributes.find("POSITION")->second];
@@ -1146,6 +1152,14 @@ std::unique_ptr<sg::SubMesh> GLTFLoader::load_model(uint32_t index, bool storage
 uvs = reinterpret_cast<const float *>(&(model.buffers[buffer_view.buffer].data[accessor.byteOffset + buffer_view.byteOffset]));

+if (gltf_primitive.attributes.find("COLOR_0") != gltf_primitive.attributes.end())
+{
+    accessor = model.accessors[gltf_primitive.attributes.find("COLOR_0")->second];
+    buffer_view = model.bufferViews[accessor.bufferView];
+    colors = reinterpret_cast<const float *>(&(model.buffers[buffer_view.buffer].data[accessor.byteOffset + buffer_view.byteOffset]));
+    color_component_count = accessor.type == TINYGLTF_PARAMETER_TYPE_FLOAT_VEC3 ? 3 : 4;
+}
+
 // Skinning
 // Joints
 if (gltf_primitive.attributes.find("JOINTS_0") != gltf_primitive.attributes.end())
@@ -1196,7 +1210,22 @@ std::unique_ptr<sg::SubMesh> GLTFLoader::load_model(uint32_t index, bool storage
 vert.pos = glm::vec4(glm::make_vec3(&pos[v * 3]), 1.0f);
 vert.normal = glm::normalize(glm::vec3(normals ? glm::make_vec3(&normals[v * 3]) : glm::vec3(0.0f)));
 vert.uv = uvs ? glm::make_vec2(&uvs[v * 2]) : glm::vec3(0.0f);

+if (colors)
+{
+    switch (color_component_count)
+    {
+        case 3:
+            vert.color = glm::vec4(glm::make_vec3(&colors[v * 3]), 1.0f);
+            break;
+        case 4:
+            vert.color = glm::make_vec4(&colors[v * 4]);
+            break;
+    }
+}
+else
+{
+    vert.color = glm::vec4(1.0f);
+}
 vert.joint0 = has_skin ? glm::vec4(glm::make_vec4(&joints[v * 4])) : glm::vec4(0.0f);
 vert.weight0 = has_skin ? glm::make_vec4(&weights[v * 4]) : glm::vec4(0.0f);
@@ -1373,6 +1402,9 @@ std::unique_ptr<sg::PBRMaterial> GLTFLoader::parse_material(const tinygltf::Mate
auto material = std::make_unique<sg::PBRMaterial>(;

// Initialize base color to 1.0f as per glTF spec
material->base_color_factor = glm::vec4(1.0f);

for (auto &gltf_value : gltf_material.values)
if (gltf_value.first == "baseColorFactor")
3 changes: 0 additions & 3 deletions samples/README.adoc
@@ -44,6 +44,3 @@ include::./extensions/README.adoc[]


7 changes: 7 additions & 0 deletions samples/extensions/README.adoc
@@ -36,6 +36,13 @@ Enables overestimation to generate fragments for every pixel touched instead of
Demonstrates how to use Dynamic Rendering.
Read the blog post here for discussion: (

=== xref:./extensions/dynamic_rendering_local_read/README.adoc[Dynamic Rendering local read]


Demonstrates how to use Dynamic Rendering with local reads to fully replace render passes with multiple subpasses.
See this[this blogpost].

=== xref:./{extension_samplespath}push_descriptors/README.adoc[Push Descriptors]

34 changes: 34 additions & 0 deletions samples/extensions/dynamic_rendering_local_read/CMakeLists.txt
@@ -0,0 +1,34 @@
# Copyright (c) 2024, Sascha Willems
# SPDX-License-Identifier: Apache-2.0
# Licensed under the Apache License, Version 2.0 the "License";
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# See the License for the specific language governing permissions and
# limitations under the License.

get_filename_component(FOLDER_NAME ${CMAKE_CURRENT_LIST_DIR} NAME)
get_filename_component(PARENT_DIR ${CMAKE_CURRENT_LIST_DIR} PATH)
get_filename_component(CATEGORY_NAME ${PARENT_DIR} NAME)

AUTHOR "Sascha Willems"
NAME "Dynamic Rendering local reads"
DESCRIPTION "Demonstrates the dynamic rendering local read extension to use input attachments with dynamic rendering"
121 changes: 121 additions & 0 deletions samples/extensions/dynamic_rendering_local_read/README.adoc
@@ -0,0 +1,121 @@
- Copyright (c) 2024, Sascha Willems
- SPDX-License-Identifier: Apache-2.0
- Licensed under the Apache License, Version 2.0 the "License";
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- See the License for the specific language governing permissions and
- limitations under the License.
= Dynamic Rendering local read
TIP: The source for this sample can be found in the[Khronos Vulkan samples github repository].


== Overview

This sample demonstrates how to use the `VK_KHR_dynamic_rendering_local_read` extension in conjunction with the `VK_KHR_dynamic_rendering` extension. This combination can replace core render passes and subpasses, making it possible to do local reads via input attachments with dynamic rendering.

== Toggling between dynamic rendering and renderpasses

To make it easy to compare the two different approaches of using either dynamic rendering + local reads or renderpasses + subpasses, this sample has code for both rendering paths.

A define in `dynamic_rendering_local_read.h` can be used to toggle between the two techniques:


This is enabled by default, making the sample use dynamic rendering with local reads. If you want to use renderpass + subpasses instead, comment this define out and compile the sample.
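
As a rough illustration of this kind of compile-time toggle, here is a minimal sketch. The macro name `SKETCH_USE_DYNAMIC_RENDERING` and the function `active_render_path` are hypothetical and exist only for this example; they are not taken from the sample's header.

```cpp
#include <string>

// Hypothetical macro name, used only for this sketch. Comment it out to
// select the renderpass + subpasses path instead.
#define SKETCH_USE_DYNAMIC_RENDERING

// Returns the name of the rendering path that was selected at compile time.
// A real sample would guard its setup and draw code the same way.
inline std::string active_render_path()
{
#if defined(SKETCH_USE_DYNAMIC_RENDERING)
    return "dynamic rendering + local read";
#else
    return "render pass + subpasses";
#endif
}
```

Because the choice is made by the preprocessor, only one of the two paths ends up in the binary, which is why the sample has to be recompiled after changing the define.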

== Comparison

For a primer on the differences between renderpasses and dynamic rendering, see the readme of the xref:../dynamic_rendering/README.adoc[dynamic rendering sample].

Here is the comparison table from that example extended with the newly added features from `VK_KHR_dynamic_rendering_local_read` in *bold*:

|===
| Vulkan 1.0 | Dynamic Rendering

| Rendering begins with `vkCmdBeginRenderPass`
| Rendering begins with `vkCmdBeginRenderingKHR`

| Rendering struct is `VkRenderPassBeginInfo`
| Rendering struct is `VkRenderingInfoKHR`

| Attachments are referenced by `VkFramebuffer`
| Attachments are referenced by `VkRenderingAttachmentInfoKHR`

| `VkFramebuffer` objects are heap-allocated and opaque
| `VkRenderingAttachmentInfoKHR` objects are stack-allocated

| Graphics pipeline creation references a `VkRenderPass`
| Graphics pipeline creation references a `VkPipelineRenderingCreateInfoKHR`

| *Subpasses are advanced with `vkCmdNextSubpass`*
| *`VkImageMemoryBarrier` to `VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR` image layout*

| *Local reads in shaders use `subpassLoad`*
| *Local reads in shaders use `subpassLoad`*
|===

== The sample

With subpasses it's possible to do pixel local reads within a single renderpass. A local read means that you can't sample freely (as with a texture and sampler) but are limited to reading the pixel value written by a previous subpass at the exact same position. This restriction reflects how tile-based GPU architectures in particular work. On such architectures, workloads that don't need to sample arbitrarily can improve performance by using subpasses with pixel local reads via input attachments. One example is a deferred renderer with a composition pass: first, multiple attachments are filled with different information (albedo, normals, world space position), and at a later point those attachments are combined into a single image. The composition step reads those attachments at the exact position the current pass is operating on, so instead of sampling from them we can use them as input attachments and do only pixel local reads.

The rendering setup for this sample looks like this:

image::./images/deferred_setup.png[Deferred setup describing subpasses]

A common criticism of renderpasses is how involved their setup is. Getting renderpasses and subpasses, including their dependencies, correct can be tricky, and renderpasses are hard to integrate into a dynamically changing setup, which makes them a poor fit for complex Vulkan projects like game engines. With dynamic rendering, setup is far less involved and mostly moves to command buffer recording. In the sample you can see how much code the renderpass path requires by looking at the parts that are deactivated via the `dynamic_rendering_local_read` C++ define. More on this can be found in the xref:../dynamic_rendering/README.adoc[dynamic rendering sample] readme. For this sample we'll only look at draw time.

== Replacing subpasses for local reads

=== Input attachments

Just like local reads in subpasses, dynamic rendering local read also makes use of input attachments. That should make it easy to convert existing code to this new extension. So unless you do advanced things like input attachment reordering, the changes required to add pixel local reads to dynamic rendering are minimal and only affect the application side. There are no changes to the shader interface, so shaders that have been used with renderpasses + subpasses can be used without any changes. Even with dynamic rendering and local reads you use `subpassInput` and `subpassLoad`.

=== Self-Dependencies

With the `dynamicRenderingLocalRead` feature enabled, it's now possible to use pipeline barriers within dynamic rendering if they include the `VK_DEPENDENCY_BY_REGION_BIT` flag. Such a barrier makes attachment writes made before the barrier readable as input attachments afterwards. The extension also introduces the new image layout `VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR`, which can be used for storage images and attachments to make writes to those visible via input attachments.

=== Renderpasses with subpasses

. Start a new renderpass with `vkCmdBeginRenderPass` (this also starts the first subpass)
. Fill G-Buffer attachments
. Start the second subpass with `vkCmdNextSubpass`
. Combine G-Buffer attachments using input attachments (and draw to screen using a full-screen quad)
. Start the third subpass with `vkCmdNextSubpass`
. Draw transparent geometry with a forward pass reading depth from an attachment
. End renderpass with `vkCmdEndRenderPass`

=== Dynamic rendering with local read

. Start dynamic rendering with `vkCmdBeginRenderingKHR`
. Fill G-Buffer attachments
. Insert a memory barrier with the "by region" bit set to make attachment writes visible for input attachment reads for the next draw call
. Combine G-Buffer attachments using input attachments (and draw to screen using a full-screen quad)
. Draw transparent geometry with a forward pass reading depth from an attachment
. End dynamic rendering with `vkCmdEndRenderingKHR`

== Conclusion

With the addition of `VK_KHR_dynamic_rendering_local_read` it's now finally possible to fully replace renderpasses, including those that have multiple subpasses. This makes dynamic rendering a fully fledged replacement for renderpasses on all implementations, including tile based architectures.

== Additional information

*[Extension proposal]
*[Extension blog post]
