
Automated vs Human monitored audio testing #37

Open
pmaness opened this issue Jun 22, 2021 · 5 comments
Labels: Deferred (Deferred to future work)

Comments

pmaness commented Jun 22, 2021

I support what Dolby is proposing for audio imprinted with coded spectral bands for automated testing. However, it might be useful to have a separate, and likely much more limited, set of files suitable for human monitoring. This would be useful for an initial verification that the test rig and DUT are operating as expected, or other cases where some debugging is necessary.

@jpiesing

This would be useful for an initial verification that the test rig and DUT are operating as expected, or other cases where some debugging is necessary.

I very much agree with both of these points.

@jpiesing

@pmaness I'm not sure what this is asking for.

  • Is it asking for files with (for example) rising & falling frequencies where a human might hear a discontinuity? Separate files with this would be a pain.
  • Is it asking for something like what we do with video: something distinctive at the start & end of the mezzanine content so a human could at least notice if the playback was being cropped?

@cta-source (Contributor)

This topic addresses the point I raised in a recent DPCTF Test Runner meeting: the design of audio tests, now that we have some results from the audio watermark study. The broader question is what the full set of audio tests should be, so this is helpful.

@pmaness, regarding something

useful for an initial verification that the test rig and DUT are operating
as expected, or other cases where some debugging is necessary

One debugging tool might be a descending PN sequence: our standard PN sequence, full scale from 0-1 s, down 1 dB from 1-2 s, down 2 dB from 2-3 s, and so on. Wired, we should be able to catch every PN segment, starting from the 0-1 s one. Over speaker/mic, we'd be able to detect the effective SNR. If the equipment or environment has problems -- say, the user selected the wrong microphone on the OF computer and has terrible recorded audio, or there is a lot of background noise in the test room -- it would show up in the detected audio SNR with this waveform. (The resulting SNR would be an estimate only, but should be a good indicator.)
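The descending-PN idea can be sketched in a few lines. This is a minimal illustration only: the 8 kHz sample rate, the particular 16-bit LFSR taps, and the 1 dB step count are placeholder assumptions, not values from this thread. On a clean (wired) channel the correlation recovers the 0, -1, -2 ... dB staircase exactly; noise or a bad microphone distorts the recovered steps, which is the diagnostic.

```python
import math

SAMPLE_RATE = 8000          # placeholder rate for the sketch
NUM_SEGMENTS = 5            # full scale, then -1 dB, -2 dB, ...

def pn_sequence(length, seed=0xACE1):
    """16-bit Fibonacci LFSR (taps 16,14,13,11) giving a +/-1 PN sequence."""
    state = seed
    out = []
    for _ in range(length):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(1.0 if bit else -1.0)
    return out

def stepped_pn(num_segments=NUM_SEGMENTS):
    """One PN segment per second, each segment 1 dB quieter than the last."""
    base = pn_sequence(SAMPLE_RATE)          # 1 second of PN
    samples = []
    for k in range(num_segments):
        gain = 10 ** (-k / 20)               # attenuate by k dB
        samples.extend(gain * s for s in base)
    return samples, base

def estimate_segment_gains_db(recorded, reference):
    """Correlate each 1 s segment against the reference PN; equipment or
    room-noise problems show up as errors in the recovered dB staircase."""
    seg_len = len(reference)
    ref_energy = sum(r * r for r in reference)
    gains = []
    for k in range(len(recorded) // seg_len):
        seg = recorded[k * seg_len:(k + 1) * seg_len]
        corr = sum(a * b for a, b in zip(seg, reference))
        gains.append(20 * math.log10(abs(corr) / ref_energy))
    return gains

signal, ref = stepped_pn()
gains = estimate_segment_gains_db(signal, ref)
```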

Ascending and descending tones may also be helpful for human listeners, but as you probably know, chord progressions are better for human testing, since pure tones create audio spatial nulls (which humans perceive as possible audio dropouts).

@jpiesing, regarding,

something like we with video, something distinctive at the start & end
of the mezzanine content so a human could at least notice if the playback was being cropped

We could do a start-of-audio click at the front of each audio track. (I'd like to move from beeps to clicks anyway, at least for human sync purposes. A click is more precise in timing.) The initial click would be timed to match the "start of video" signal, of course.
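A rough sketch of why a click is more precise than a beep: if the click is just a few full-scale samples at a known offset (the 4-sample length here is an arbitrary choice, not from the thread), a simple threshold crossing locates it to the exact sample, whereas a beep's gradual onset makes the detected start ambiguous.

```python
def click_track(total_samples, click_offset=0, click_len=4):
    """Silent track with a short full-scale click at a known sample offset."""
    track = [0.0] * total_samples
    for i in range(click_len):
        track[click_offset + i] = 1.0
    return track

def detect_click(track, threshold=0.5):
    """Return the index of the first sample crossing the threshold --
    sample-accurate for a click, ambiguous for a ramped beep."""
    return next(i for i, s in enumerate(track) if abs(s) > threshold)
```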

These are just reactions or ideas. As we develop our full set of audio tests, we can keep "humans" in mind. But as a starting point,

Audio Tests for System Verification

  • System SNR Estimation -- Automated test; use to verify the equipment meets basic minimum quality-of-transmission requirements.
  • Chord Progressions -- Human listening test; use when desiring to hear basic system operation. Built from synthesized chords with a click before and after each one, e.g.: click, C major (1 second), click, D major (1 s), click, E major (1 s), etc. Actual chords are TBD.
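The click-plus-chord layout might look like the sketch below. The C/D/E major triads follow the example in the comment (actual chords are TBD), the frequencies are standard equal-temperament values, and the 16 kHz rate and 8-sample click length are assumptions made only for illustration.

```python
import math

RATE = 16000  # assumed sample rate for the sketch

CHORDS = {  # root-position triads, equal temperament (Hz)
    "C": (261.63, 329.63, 392.00),
    "D": (293.66, 369.99, 440.00),
    "E": (329.63, 415.30, 493.88),
}

def chord(freqs, seconds=1.0):
    """Synthesize one chord segment as a normalized sum of sines."""
    n = int(RATE * seconds)
    return [sum(math.sin(2 * math.pi * f * t / RATE) for f in freqs) / len(freqs)
            for t in range(n)]

def click(length=8):
    """Short full-scale click used as a timing marker between chords."""
    return [1.0] * length

def progression(names=("C", "D", "E")):
    """Click, chord (1 s), click, chord, ... with a trailing click."""
    out = []
    for name in names:
        out.extend(click())
        out.extend(chord(CHORDS[name]))
    out.extend(click())
    return out
```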


pmaness commented Sep 24, 2021

  • One that we use in DTS labs is a channel-name call-out (a human voice) synced to an animated speaker layout that highlights the called channel. You know on the first call-out (left channel) that video is aligned (more or less) and audio is being routed correctly. If a system is configured for multi-channel, this allows an initial check that the test environment is correct.
  • Click/flash tracks are much better for measuring A/V sync and A/V drift in an automated environment.
  • Pure tones can be used to verify correct downmix in an automated environment.
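A minimal sketch of the pure-tone downmix check. It assumes an ITU-R BS.775-style stereo downmix in which the centre channel is mixed into left and right at -3 dB; that coefficient is a common convention used here as an example, not something specified in this thread. A tone fed only to the centre channel should then measure about 3 dB down in each stereo output.

```python
import math

def rms(x):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def downmix_center_to_stereo(center, gain=1 / math.sqrt(2)):
    """Mix the centre channel into L and R at -3 dB (ITU-style convention)."""
    left = [gain * s for s in center]
    right = [gain * s for s in center]
    return left, right

RATE = 8000  # assumed rate for the sketch
tone = [math.sin(2 * math.pi * 1000 * t / RATE) for t in range(RATE)]
left, right = downmix_center_to_stereo(tone)
drop_db = 20 * math.log10(rms(left) / rms(tone))  # expect roughly -3 dB
```

An automated test would compare `drop_db` against the expected downmix coefficient and flag any channel whose measured level deviates beyond a tolerance.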

gitwjr commented Nov 22, 2022

Part of RFC. Deferred for future work.

gitwjr added the "Deferred" label (Deferred to future work) on Dec 9, 2022