Audio description failure technique #4390

Open
wants to merge 10 commits into
base: AudioDescriptionImportantUnderstanding

mbgower
Contributor

@mbgower mbgower commented May 9, 2025

Closes #3806
Addresses #1768

Creates a failure technique to show the conditions under which a video fails 1.2.5 due to important visual information not being described in pauses in the dialogue

Preview

mbgower added 2 commits May 9, 2025 06:17
Addresses #1768

Creates a failure technique to show the conditions under which a video fails 1.2.5 due to important visual information not being described in pauses in the dialogue
@mbgower mbgower mentioned this pull request May 9, 2025

netlify bot commented May 9, 2025

Deploy Preview for wcag2 ready!

Name Link
🔨 Latest commit 3bc641f
🔍 Latest deploy log https://app.netlify.com/projects/wcag2/deploys/682779579e2dbe000848b303
😎 Deploy Preview https://deploy-preview-4390--wcag2.netlify.app

Member

@scottaohara scottaohara left a comment

just one suggestion. otherwise this makes sense to me - and the other delving into the applicability or essential nature of a pause can be handled in content to be added to the understanding doc

mbgower and others added 3 commits May 12, 2025 06:55
I've removed "must" to address Scott's concern
Remove note styling and add in wording suggested in #3806
I think it may look better with the note. Trying to retain just for the pre-existing paragraph to see how that looks
<p>For each occurrence of synchronized time-based media:</p>
<ol>
<li> Check that all important visual information has been conveyed in the narration.</li>
<li> Check that narration has been added in all appropriate pauses in the dialogue. </li>
Member

Check that audio description is provided for any visual information necessary to understand the content which is not conveyed by audio in the synchronized media.

Contributor Author

@mbgower mbgower May 12, 2025

This second check is there to make sure any available appropriate pauses are used.
The first check has already validated that the audio descriptions either cover all the material, or they don't.
Here, we're testing that all the pauses have been used for narration.
Together they give us the gates for the failure:
#1 passes= no failure
#1 fails and #2 passes=no failure
#1 fails, #2 fails=failure
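
The three gates above can be sketched as a small boolean function. This is only an illustrative Python sketch, not part of the PR; the name `failure_applies` and its two parameters are hypothetical stand-ins for the outcomes of checks #1 and #2.

```python
def failure_applies(check1_passes: bool, check2_passes: bool) -> bool:
    """Return True only when both checks fail (hypothetical helper).

    check1_passes: all important visual information is conveyed in the narration.
    check2_passes: narration has been added in all appropriate pauses.
    """
    return (not check1_passes) and (not check2_passes)


# The three gates listed in the comment, keyed by their description.
gates = {
    "#1 passes": failure_applies(True, True),
    "#1 fails, #2 passes": failure_applies(False, True),
    "#1 fails, #2 fails": failure_applies(False, False),
}
```

Under this reading, a passing check #1 short-circuits the test, so the state of the pauses only matters once some important visual information is known to be missing.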

Member

The original wording is actually checking that all possibly useful pauses are used, which may not be appropriate.

<p>This describes a failure condition for all techniques involving audio descriptions. Audio descriptions can either be provided as part of the soundtrack's original narration or be added to audio during pauses in existing dialogue. If important actions, characters, scene changes, and on-screen text are only conveyed visually, appropriate pauses in dialogue need to be used to provide this information as audio descriptions.</p>
<p class="note">Not all pauses are usable for audio descriptions. If the pauses in dialogue are too short (for instance, less than 2 seconds), or do not occur in proximity to the visual content that needs to be described, they may not be appropriate for audio descriptions.</p>
<p>In situations where important visual information is being conveyed at the same time as non-spoken audio (such as music and sound effects), the technique of "audio ducking" can be used. This involves dropping the overall sound level so that it is easier to distinguish the narration that is added during pauses in dialogue.</p>
<p>This technique can work well with background music and sounds, but audio ducking has the potential to mask important audio information. Often such audio information can convey much of the sense of what is visually happening. There may be some pauses in dialogue where non-spoken audio information is so important that the addition of audio descriptions is unnecessary or inappropriate.</p>
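
One way to picture the note about unusable pauses is a filter over candidate pauses. The sketch below is illustrative only and not part of the PR: `Pause`, `usable_pauses`, and the `max_distance` proximity threshold are hypothetical. The 2-second minimum mirrors the "for instance" figure in the note; the note gives no number for proximity, so 5 seconds is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pause:
    start: float  # seconds from the start of the media
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def usable_pauses(pauses, visual_event_time, min_duration=2.0, max_distance=5.0):
    """Keep pauses that are long enough and close enough to the visual event.

    min_duration follows the "less than 2 seconds" illustration in the note.
    max_distance is an assumed value; the note does not quantify proximity.
    """
    usable = []
    for pause in pauses:
        if pause.duration < min_duration:
            continue  # too short to hold a description
        # Distance from the event to the nearest edge of the pause (0 if inside it).
        distance = max(pause.start - visual_event_time,
                       visual_event_time - pause.end,
                       0.0)
        if distance <= max_distance:
            usable.append(pause)
    return usable


pauses = [Pause(10.0, 11.0), Pause(14.0, 17.5), Pause(60.0, 64.0)]
candidates = usable_pauses(pauses, visual_event_time=12.0)
```

In practice, whether a pause is appropriate is an editorial judgement; fixed thresholds like these could at most flag candidates for a human reviewer.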
Member

Having trouble following the point of this paragraph. Is it that ducking can obscure needed audio information or that sometimes the non-spoken audio can eliminate the need for some descriptions?

Contributor Author

@mbgower mbgower May 12, 2025

I assume you mean the last one?

Is it that ducking can obscure needed audio information or that sometimes the non-spoken audio can eliminate the need for some descriptions?

A bit of both, I think. There is material about not filling every pause with more information if the audio is actually more important than the video
For instance, see example 2

Some examples would probably be useful, although maybe that would be better in a sufficient technique than in a failure technique.

Contributor Author

@mbgower mbgower May 16, 2025

@awkawk, let me know if this helps clarify. Open to suggestions!

Suggested change
<p>This technique can work well with background music and sounds, but audio ducking has the potential to mask important audio information. Often such audio information can convey much of the sense of what is visually happening. There may be some pauses in dialogue where non-spoken audio information is so important that the addition of audio descriptions is unnecessary or inappropriate.</p>
<p>This technique can work well with background music and sounds, but audio ducking has the potential to mask important audio information. Sometimes such audio information can convey much of the sense of the video. </p>
<p class="note">There may be some pauses in dialogue where non-spoken audio information is so important that the addition of audio descriptions is inappropriate.</p>
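
For readers unfamiliar with audio ducking, the idea of dropping the main soundtrack's level while narration plays can be sketched as follows. This is a simplified illustration, not part of the PR: the function names, the `(start, end)` segment format, and the gain values are all hypothetical, and a real mix would normally fade the gain in and out rather than switch it abruptly.

```python
def ducked_gain(t, narration_segments, duck_gain=0.3):
    """Gain to apply to the main soundtrack at time t (seconds).

    Returns duck_gain while a narration segment is active, otherwise 1.0.
    """
    for start, end in narration_segments:
        if start <= t < end:
            return duck_gain
    return 1.0


def apply_ducking(samples, sample_rate, narration_segments, duck_gain=0.3):
    """Scale each soundtrack sample by the gain for its timestamp."""
    return [
        sample * ducked_gain(i / sample_rate, narration_segments, duck_gain)
        for i, sample in enumerate(samples)
    ]


# Toy track: 4 samples at 1 sample per second, narration from t=1s up to t=3s.
ducked = apply_ducking(
    [1.0, 1.0, 1.0, 1.0],
    sample_rate=1,
    narration_segments=[(1.0, 3.0)],
    duck_gain=0.5,
)
```

The sketch also shows the risk the paragraph raises: every sound in the ducked interval is attenuated equally, so an important sound effect that falls inside a narration segment is reduced along with the background music.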

Incorporating the suggestion
@awkawk
Member

awkawk commented May 12, 2025

Here's what I was thinking this was testing:

For all synchronized media with video:

  1. Check that all important visual information in the video is conveyed by the main soundtrack.
  2. Check that any important visual information in the video that isn't conveyed in the main soundtrack is conveyed by audio description.

Is this what you are saying?

@mbgower
Contributor Author

mbgower commented May 12, 2025

Here's what I was thinking this was testing:

For all synchronized media with video:

  1. Check that all important visual information in the video is conveyed by the main soundtrack.
  2. Check that any important visual information in the video that isn't conveyed in the main soundtrack is conveyed by audio description.

Is this what you are saying?

Nope.

  1. Check that all important visual information in the video is conveyed by the main soundtrack.
  2. Check that all pauses are used for audio descriptions

If you fail both, you fail.

I was hoping that was obvious from the title "Failure of Success Criterion 1.2.5 due to not using available pauses in dialogue to provide audio descriptions of important visual content"

@awkawk
Member

awkawk commented May 12, 2025

Nope.

  1. Check that all important visual information in the video is conveyed by the main soundtrack.
  2. Check that all pauses are used for audio descriptions

If you fail both, you fail.

I was hoping that was obvious from the title "Failure of Success Criterion 1.2.5 due to not using available pauses in dialogue to provide audio descriptions of important visual content"

Ha! Hope is dashed! :)

The title is not the test procedure, you can't rely on it.

So you want to combine my 1 and 2 into 1. How about:

  1. Check that all important visual information in the video is conveyed by the main soundtrack or by an audio description track.
  2. Check that the main soundtrack has pauses that would be appropriate to provide audio description of important visual content.

both true: pass
both false: pass
1 true, 2 false: pass
1 false, 2 true: fail
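
The four outcomes above can be enumerated mechanically. The following Python sketch is illustrative only; `outcome` is a hypothetical helper, with `check1` and `check2` standing for the two checks as worded in this comment.

```python
from itertools import product


def outcome(check1: bool, check2: bool) -> str:
    """Fail only when important visuals are not conveyed (check 1 false)
    while appropriate pauses exist in the main soundtrack (check 2 true)."""
    return "fail" if (not check1 and check2) else "pass"


# Build the full truth table for both checks.
table = {(c1, c2): outcome(c1, c2) for c1, c2 in product([True, False], repeat=2)}
```

Note that under this wording check 2 asks whether suitable pauses exist, not whether they were used, so a true result on check 2 is what enables the failure.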

@mbgower
Contributor Author

mbgower commented May 13, 2025

That wording of the second check seems harder to parse, to me. I've updated the wording to make it more what you wanted. Can you explain where you are primarily snagging on this now?

  1. Check that all important visual information is conveyed by the main soundtrack or by audio descriptions.
  2. Check that audio descriptions have been added in all appropriate pauses in the dialogue.

If both of these are false, the failure applies.

@awkawk
Member

awkawk commented May 13, 2025

Can you explain where you are primarily snagging on this now?

You really seem to want to have audio description in all pauses, but this is definitely the snag.

@mbgower
Contributor Author

mbgower commented May 13, 2025

Can you explain where you are primarily snagging on this now?

You really seem to want to have audio description in all pauses, but this is definitely the snag.

I would not say that. What the test is doing is checking if all the important video is described. If it's all described, then unused pauses are immaterial, it passes.
If all important visuals are not described, then check that all appropriate pauses have been used. If the meaningful visuals aren't all described and if all the appropriate pauses aren't utilized, then it fails, because it hasn't met the SC (emphasis on all):

Audio description is provided for all prerecorded video content in synchronized media.

@awkawk
Member

awkawk commented May 13, 2025

@mbgower I emailed with Bryan Gould, who is at WGBH-NCAM and worked for many years as a description editor at WGBH-DVS.

Bryan replied to me (quoted sections are from my email to Bryan, non-quoted are his replies):
I’ve responded to your comments below:

My concern is the second check, which suggests that the right way to handle audio description is to fill every available pause with audio description if there is any information deemed important that is missing. I worry that we will get into a situation where well-intentioned people will think that there is some important information from the video that isn’t voiced and therefore must fill every pause.

I agree. For 2, the word “appropriate” provides enough ambiguity that this check may very well be interpreted as a “fill-every-pause” mandate.

If that is the case, the next question is, what constitutes a pause? A lot of information can be conveyed with a single well-placed word, for example, “Later” or “Elsewhere.” These words can be vital to understanding but very few one-second breaks in dialogue require description.

The practice of filling every pause with description may also be detrimental. In order to reduce cognitive load, the viewer may filter out the “excess” description. As a result, the viewer may miss the actually important information.

I suspect that even the best-described video might fail this evaluation, as there is way too much visual information to completely reproduce in AD, and there are times where a description editor needs to make a judgement about the impact on the mood of the content and viewer’s experience.

Correct. A description editor’s main role is to apply context to all visual information in order to deem what is “important” and thus requiring of description. For example, a 10-second view of an orchid requires a different description approach depending on whether the context is botany, photography, or part of a montage set to music. There are also instances when a pause in dialogue is intended for the viewer to reflect on what was just said; this is especially true in education and training videos. In other words, there is almost always too much visual information to describe and the description editor is always deciding what is important. Finally, dedication to context helps to prevent the cognitive load issue raised above.

@mbgower
Contributor Author

mbgower commented May 13, 2025

Thanks for providing that context and outside, informed opinion in your latest comment, @awkawk.

There's nothing stated there that is new to me, and I share all those concerns and observations based on my own limited experience in this field. However, this is pretty similar, philosophically, to problems that exist in 1.1.1 for images. In the case of 1.1.1, we realistically have a gross litmus test: "is there an alt (and if not, is this purely decorative)?" This is the basic check every automated checker currently does. And then we have a qualitative test: "is it equivalent?"

We've all seen some pretty awful alt text. But concerns with the quality of the alt have not deterred us from insisting that alt text must exist.

In other words, there is almost always too much visual information to describe and the description editor is always deciding what is important. Finally, dedication to context helps to prevent the cognitive load issue raised above.

That is a fundamental challenge with the art of audio descriptions. The pauses, ultimately, do to a great degree determine what one decides are the important visuals. (I even wrote a blog about this very topic once.) I intentionally incorporated the idea of "appropriate" into the check for pauses to help provide some flexibility there.

For audio descriptions, we only have the text we were given to work with. I thought a gross test for 1.2.5 could be "Is there any audio description?" That seems entirely supported by the normative text. But several people have pushed back against that idea.

The only other gross tests we seem to have at our disposal with the normative wording are:

  • are there important visuals with no audio equivalent?
  • are there pauses which could be used to insert audio descriptions?

Both of these are highly subjective. I sympathize with the concerns Bryan states, but the reality is that as a percentage of total videos on the web, the number that are audio described is much closer to 0% than 1%. I think we have to provide some kind of gross test for audio descriptions before we turn to worrying about whether the descriptions are good or not.

Rejecting usable pauses as a metric for 1.2.5 for fear someone creates crappy audio descriptions seems unfortunate to me. I've already opened an issue to create a sufficient technique that documents how to do good audio descriptions. To me, that is a better way of addressing his concern than ruling out "pauses" as a part of a failure assessment.

@mbgower
Contributor Author

mbgower commented May 14, 2025

@awkawk a possible compromise is to remove "all" from both of the checks. I'm not keen on it (since it really lowers the bar for passing) BUT it would create a higher bar for failing, which I believe is what you're advocating for?

  1. Check that all important visual information has been conveyed in the narration.
  2. Check that narration has been added in all appropriate pauses in the dialogue.

Follow up: Or maybe the "all" only needs to go on the second check?

@mbgower
Contributor Author

mbgower commented May 15, 2025

After some discussion with @awkawk, we tentatively arrived at:

Test Procedure
For each occurrence of synchronized media containing video:

  1. Check that all important visual information has been conveyed in the narration.
  2. Check that narration has been added in pauses in the dialogue, where appropriate.

If checks 1 and 2 are false, then this failure condition applies and the content fails the success criterion.
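
As a rough illustration of applying this tentative procedure to each occurrence of synchronized media, consider the Python sketch below. It is not part of the PR: `MediaResult`, `failing_media`, and the file names are hypothetical, and the two booleans are assumed to be recorded by a human evaluator.

```python
from dataclasses import dataclass


@dataclass
class MediaResult:
    name: str
    visuals_conveyed: bool  # check 1: important visuals conveyed in the narration
    pauses_narrated: bool   # check 2: narration added in pauses, where appropriate


def failing_media(results):
    """Return the names of media where both checks are false."""
    return [
        r.name
        for r in results
        if not r.visuals_conveyed and not r.pauses_narrated
    ]


results = [
    MediaResult("intro.mp4", visuals_conveyed=True, pauses_narrated=False),
    MediaResult("demo.mp4", visuals_conveyed=False, pauses_narrated=True),
    MediaResult("lecture.mp4", visuals_conveyed=False, pauses_narrated=False),
]
failed = failing_media(results)
```

Here only `lecture.mp4` meets the failure condition; `demo.mp4` does not, because its appropriate pauses were used even though some important visual information remains undescribed.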


Note that I have only updated the original PR language here. I haven't incorporated any of the other suggestions to date.

@mbgower
Contributor Author

mbgower commented May 16, 2025

Moved to ready for approval; will move back depending on @awkawk response.

Contributor

@bruce-usab bruce-usab left a comment

I don't think the 2nd step of test procedure tracks closely enough to the failure name and it's not as simple as it could be. I recommend:

For each occurrence of synchronized media containing video:

  1. Check that all important visual information has been conveyed in the narration.
  2. Check that there are available pauses in dialogue where narration could be added.

If check 1 is false and check 2 is true, then this failure condition applies and the content fails the success criterion.

mbgower added 2 commits May 16, 2025 10:27
Incorporating recommendation from @awkawk:
"I think that we should differentiate between narration in #2 and narration in #1 by using audio description in #2 instead of narration."
Incorporated based on discussion with @awkawk
@mbgower
Contributor Author

mbgower commented May 16, 2025

@bruce-usab

I don't think the 2nd step of test procedure tracks closely enough to the failure name and it's not as simple as it could be.

The PR was updated after your comment, and may have at least partially addressed it. Please review and comment.

4 participants