
# KHR audio framework design proposal #2421

Open · wants to merge 3 commits into `main`
## Conversation


**@cashah** commented Jul 2, 2024

During the recent 3D Formats working group meeting, we reviewed the proposal to define the KHR audio glTF specification using an audio graph framework. The purpose of the current document is to delve deeper into that proposal, offering a comprehensive design of the KHR audio graph. This includes a detailed description of each node object within the graph along with functionality, the specific properties associated with it, and how it interacts with other nodes in the graph. The document is structured to facilitate clear understanding and to solicit feedback on the proposed design. Based on the feedback we will update and finalize the design before it is formally schematized into KHR core audio spec, extensions, animations, and interactivity.

**@cashah** requested a review from rudybear on July 2, 2024 at 16:53
Review thread on:

> `string` — Shape in which emitter emits audio (cone, omnidirectional, custom).


Why is custom included here? Is there a way of defining a custom shape? How would it be implemented?

**Contributor** replied:

It would be implemented by another glTF extension, extending this extension.
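As a sketch of how that might look (the extension name and property here are hypothetical, invented for illustration, not part of any published extension), a vendor extension on the emitter could carry the custom shape data while `shape` is set to `custom`:

```json
{
  "shape": "custom",
  "extensions": {
    "EXT_vendor_custom_emitter_shape": {
      "mesh": 0
    }
  }
}
```

An implementation that does not understand the vendor extension could then fall back to a default such as omnidirectional emission, following the usual glTF extension fallback pattern.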

A second thread on the same line:

> `string` — Shape in which emitter emits audio (cone, omnidirectional, custom).


Does omnidirectional need to be included when it can be achieved with a 360-degree cone angle?

**@aaronfranke** (Contributor) commented Jul 14, 2024:

Note: This is how KHR_audio_emitter works. However, explicitness is nice, so I am wondering if it makes sense to change KHR_audio_emitter to use a type enum like this.

**Author** replied:

Keeping it explicit to make it runtime-friendly.




### 4.3 Oscillator data

Has band-limiting been considered for audio rate oscillator implementations? Is it assumed? Or does it need to be explicitly required? I would think it should be explicitly required if audio quality is a priority.

**Author** replied:

Made that explicit by including a detune parameter. The implementation can then use this with frequency to calculate the desired oscillator frequency.
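For reference, the common convention for a detune parameter (used by Web Audio's `OscillatorNode`, for example) is an offset in cents; the sketch below assumes that convention, and the function name is invented here:

```python
def oscillator_frequency(frequency: float, detune_cents: float) -> float:
    """Effective oscillator frequency with a detune offset in cents.

    Assumes the Web Audio-style convention:
        computed = frequency * 2 ** (detune_cents / 1200)
    where 1200 cents = one octave and 100 cents = one semitone.
    """
    return frequency * 2.0 ** (detune_cents / 1200.0)


# A detune of +1200 cents raises A4 (440 Hz) by one octave to 880 Hz.
print(oscillator_frequency(440.0, 1200.0))  # → 880.0
```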

**Reply:**

Ok, thanks, but I don't think adding a detune parameter has anything to do with making band-limiting explicit. If it's assumed that implementations will apply band-limiting to avoid aliasing, then that's fine, but I suspect many implementations will take the easier route of skipping band-limiting if it's not required, which will reduce audio quality.

It may be that this concern is outside the scope of the standard, but if there a lot of poor-quality implementations then it may cause creators to avoid using it, so we might want to make band-limiting required.

**Author** replied:

There are various practical methods an implementation might use to prevent aliasing. While the ideal discrete-time digital audio signal is clearly defined mathematically, each implementation must balance the trade-off between computational cost and fidelity to this ideal. It is anticipated that implementations will strive towards this ideal, but it is also reasonable to consider lower-quality, more cost-effective approaches for less powerful hardware. We can include more guidance and recommendations around this.
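As one concrete illustration of such a method (a sketch only; the proposal does not mandate any particular technique, and the function name is invented here), an additive oscillator avoids aliasing by summing only the harmonics that fall below the Nyquist frequency:

```python
import math


def bandlimited_saw(freq: float, sample_rate: float, n_samples: int) -> list[float]:
    """Additive band-limited sawtooth: sum sine harmonics up to Nyquist.

    saw(t) = (2/pi) * sum_{k=1..K} sin(2*pi*k*f*t) / k,
    with K = floor(Nyquist / f), so no partial can alias.
    """
    k_max = int((sample_rate / 2.0) // freq)
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        s = sum(math.sin(2.0 * math.pi * k * freq * t) / k
                for k in range(1, k_max + 1))
        out.append((2.0 / math.pi) * s)
    return out


# At 1 kHz and a 48 kHz sample rate, only the first 24 harmonics are kept.
samples = bandlimited_saw(1000.0, 48000.0, 48)
```

A naive sawtooth (a bare ramp) is cheaper but folds every harmonic above Nyquist back into the audible band; this additive form is the "fidelity over cost" end of the trade-off described above.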


### 6.8 Filter node (1 input / 1 output)

> Use the Audio Mixer node to combine the output from multiple audio sources. A filter node always has exactly one input and one output.

This looks to be partially copied from the previous item by mistake.

**Author** replied:

Updated.




### 6.9 Reverb node (1 input / 1 output)

Reverb implementations vary widely. Has any thought been given to consistency across implementations for this node? Web-based implementations would have to create their own reverb nodes since there is no Web Audio reverb node yet. Would a Convolver node make more sense?

**Author** replied:

Good callout; extended the node to support convolution reverb in addition to algorithmic reverb.

**Reply:**

Ok, thanks, but this doesn't address my concern about the algorithmic reverb sounding different across implementations. I don't think sound designers are going to use a reverb if it doesn't sound the same everywhere, so how do we get the algorithmic reverb to sound the same on all implementations? Can a specific reverb algorithm be required by the standard, like maybe a basic plate reverb?

**Author** replied:

Got it. Unfortunately, there is no universal standard for reverb implementations, which vary significantly in their core logic and complexity; many implementations remain unpublished. Therefore, the objective is not to restrict reverb to a particular implementation but rather to parameterize it sufficiently to accommodate most implementations. Requiring a specific reverb algorithm as a standard would, in fact, diminish the value for users. That said, one can always leverage convolution reverb with a specific impulse response.
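To illustrate the convolution option (an illustrative sketch, not part of the proposal; real-time implementations typically use partitioned FFT convolution for efficiency, and the function name here is invented), convolution reverb is just the convolution of the dry signal with a recorded impulse response:

```python
def convolve(dry: list[float], impulse_response: list[float]) -> list[float]:
    """Direct-form convolution: wet[n] = sum_k ir[k] * dry[n - k].

    The output is len(dry) + len(impulse_response) - 1 samples long;
    the samples past the end of the dry signal are the reverb tail.
    """
    wet = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(impulse_response):
            wet[n + k] += x * h
    return wet


# A unit impulse played through the IR reproduces the IR itself.
print(convolve([1.0], [0.5, 0.25, 0.125]))  # → [0.5, 0.25, 0.125]
```

Because the impulse response fully determines the result, two conforming implementations given the same IR produce the same output, which is exactly the consistency property the algorithmic reverb cannot guarantee.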

**Reply:**

> Requiring a specific reverb algorithm as a standard would, in fact, diminish the value for users.

Maybe, but I think users won't create graphs with algorithmic reverbs because they won't know how they'll sound on all implementations. Instead, they'll bake the exact reverb they want directly into the audio sources. They'll do this with any audio nodes that vary significantly across implementations.

**@cashah** (Author) commented Jul 17, 2024:

> because they won't know how they'll sound on all implementations

That's a valid point, and it extends beyond just different implementations to the same implementation being used across various device types, such as mobile, desktops, and HMDs. Each device type, and indeed each variant within those types, can potentially alter the sound characteristics, making it challenging for users to predict how the reverb will sound without testing it on each device themselves. Moreover, developers often face a trade-off between computational cost and fidelity in their implementations. This balance is crucial to ensure that the reverb not only meets quality expectations but also remains computationally feasible and customizable across diverse hardware platforms.

**@aaronfranke** (Contributor) left a comment:

There is a lot missing from this PR. Is KHR_audio_graph implemented anywhere? Are there any sample assets? Where are the JSON schemas?

Please define the relationship between KHR_audio_graph and KHR_audio_emitter. An audio graph is too complicated for most scenes and engines, which often only need to define basic emission. The Google Slides document linked in KHR_audio_graph mentions Unity DSPGraph, which is an extra package, not part of the base Unity. I wrote a longer list of specific discussion points here: #2137 (comment)




## **Contributors**
**Contributor** commented:

Headings shouldn't have any additional formatting symbols.

Suggested change:

```diff
-## **Contributors**
+## Contributors
```

Comment on lines +34 to +35
* Chintan Shah, Meta
* Alexey Medvedev, Meta
**Contributor** commented:

The Khronos preferred Markdown style is to use `-` for unordered lists:

Suggested change:

```diff
-* Chintan Shah, Meta
-* Alexey Medvedev, Meta
+- Chintan Shah, Meta
+- Alexey Medvedev, Meta
```

Comment on lines 8 to 13
> Using this Markdown file:
>
> 1. Paste this output into your source file.
> 2. See the notes and action items below regarding this conversion run.
> 3. Check the rendered output (headings, lists, code blocks, tables) for proper formatting and use a linkchecker before you publish this page.
**Contributor** commented:

Is this leftover from something? It should be removed.


## Context

During the recent Khronos 3D formats working group meeting held on 5/29, we reviewed the [proposal to define the KHR audio glTF specification using an audio graph framework](https://docs.google.com/presentation/d/1IrrQaE-jHyzOtFRabjtLAzeP5UOirFOEj8FRADAceqk/edit?usp=sharing). The purpose of this document is to delve deeper into that proposal, offering a comprehensive design of the KHR audio graph. This includes a detailed description of each node object within the graph along with functionality, the specific properties associated with it, and how it interacts with other nodes in the graph. The document is structured to facilitate clear understanding and to solicit feedback on the proposed design. Based on the feedback we will update and finalize the design before it is formally schematized into KHR core audio spec, extensions, animations, and interactivity.
**Contributor** commented:

Please include the year and use ISO 8601 to make this unambiguous to future readers.

Suggested change (only the date changes):

```diff
-During the recent Khronos 3D formats working group meeting held on 5/29, we reviewed the proposal
+During the recent Khronos 3D formats working group meeting held on 2024-05-29, we reviewed the proposal
```

Commits:

1. Included a detune property on the oscillator. The implementation can then combine detune with frequency to calculate the desired oscillator frequency.
2. Extended the reverb node to support algorithmic as well as convolution reverb implementations.
3. Removed custom enum types; these will be implicit with custom extensions.
**@andybak** commented Nov 9, 2024:

Is there any documentation or records of discussions that covers the "why" rather than the "what"?

i.e. the case for why this belongs in glTF, discussions of challenges related to implementation and adoption, and any exploration of the costs vs. the benefits.

I have no doubt these conversations happened; are they published anywhere?

Labels: none yet · Projects: none yet · 7 participants