
Observation Algorithms

yanj-github edited this page Aug 18, 2023 · 6 revisions

Video Observation Algorithms

This section defines observation algorithms for each video observation.

  • Detected mezzanine QR codes: [QRa, QRb, QRc .... QRn]
  • Sample duration in ms of the recording: Dr = 1000ms / camera_frame_rate
  • Delay in QR code creation on Test Runner (extracted from the Test Runner QR code): d
  • Maximum permitted start-up delay (CTA-2003: 120ms): TSMax
  • A fixed adjustment equal to half of a recording frame: CAMERA_FRAME_ADJUSTMENT = 0.5
  • tolerance = 20ms

Every frame shall be rendered and the samples shall be rendered in increasing presentation time order

  1. Check that the first frame is rendered.

     QRa.mezzanine_frame_num == 1
    

    For random access to a frame:

     QRa.mezzanine_frame_num == random_access_frame_num
    

    For random access to a time:

     QRa.mezzanine_frame_num == rounddown(random_access_time * mezzanine_frame_rate)
    
  2. Check that the last frame is rendered. Half of a frame duration is added before the rounddown so that the correct frame number is calculated regardless of how cmaf_track_duration was rounded.

     QRn.mezzanine_frame_num == rounddown((cmaf_track_duration + half_frame_duration) * mezzanine_frame_rate)
    
  3. Check that the frames are rendered in increasing order.

     for QRb to QRn:
         QR[i-1].mezzanine_frame_num + 1 == QR[i].mezzanine_frame_num
    
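The steps above can be sketched as follows. This is a minimal illustration, not the framework's implementation: it assumes the detected mezzanine frame numbers have already been extracted from the QR codes into a plain list, and the helper names are hypothetical.

```python
import math

def expected_last_frame(cmaf_track_duration_s, mezzanine_frame_rate):
    # Add half a frame duration before rounding down, so the result does not
    # depend on how cmaf_track_duration itself was rounded.
    half_frame_duration = 0.5 / mezzanine_frame_rate
    return math.floor((cmaf_track_duration_s + half_frame_duration) * mezzanine_frame_rate)

def check_frame_order(frame_nums, first_expected, last_expected):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []
    if frame_nums[0] != first_expected:
        errors.append(f"first frame is {frame_nums[0]}, expected {first_expected}")
    if frame_nums[-1] != last_expected:
        errors.append(f"last frame is {frame_nums[-1]}, expected {last_expected}")
    # Frames must increase by exactly one between consecutive detections.
    for prev, cur in zip(frame_nums, frame_nums[1:]):
        if cur != prev + 1:
            errors.append(f"frame {cur} does not directly follow frame {prev}")
    return errors
```

For example, a 10 s track at 30 fps gives an expected last frame of floor((10 + 0.5/30) * 30) = 300.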

Switching and Splicing test

  1. Check that the first sample is rendered.
  2. Check that the last sample is rendered.
  3. Based on the playout parameter, work out the switching and splicing points.
  4. Based on step 3, check the ending frame and starting frame at each switching and splicing point.
  5. Check that the samples are rendered in increasing order within the same switching and splicing block.

Playback with gaps test

e.g: Low-Latency: Playback over Gaps

  1. Check that the first sample is rendered.
  2. Check that the last sample is rendered.
  3. Based on the parameters, work out the start and end points of the gap.
  4. Based on step 3, check the ending frame and starting frame at the gap. A tolerance needs to be applied before the gap starts; however, no tolerance is expected after the end of the gap.
  5. Check that the samples are rendered in increasing order from playback start until the gap, and from after the gap until the end of playback.

Truncated Playback and Restart

  1. Check that the first sample is rendered.
  2. Check that the last sample is rendered.
  3. Based on the playout and second_playout_switching_time parameters, work out the switching point.
  4. Based on step 3, check the ending frame and starting frame at the switching point.
  5. Check that the samples are rendered in increasing order within the same switching block.
  6. Based on step 3, check that the ending frame lies between second_playout_switching_time and the end of playout, and that frames are rendered only once.
  7. Based on the second_playout parameter, work out the switching point.
  8. Based on step 7, check the ending frame and starting frame at the switching point.
  9. Check that the samples are rendered in increasing order within the same switching block.

The playback duration of the playback matches expected duration

Some devices display frame 1 before the play event, so measurement starts from a frame detected after play(). Some devices also hold on to the last frame until the next test is loaded. The actual playback duration is therefore measured from the first detection time of the first detected frame to the first detection time of the last detected frame; the last frame duration is added to account for the whole duration.

    expected_track_duration = cmaf_track_duration - start_missing_frame_duration - ending_missing_frame_duration - frame_duration_prior_to_play
    actual_playback_duration = (QRn.first_appear_camera_frame_num - QRa.first_appear_camera_frame_num) * Dr + last_frame_duration
    actual_playback_duration == expected_track_duration +/- tolerance

A bigger tolerance should be applied for the tests that require one.
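A minimal sketch of the duration comparison, assuming all times are in milliseconds and the camera frame numbers have already been extracted (argument names mirror the variables above):

```python
def playback_duration_matches(qr_a_first_cam_num, qr_n_first_cam_num, Dr,
                              last_frame_duration, expected_track_duration,
                              tolerance=20.0):
    """Compare measured playback duration against the expected track duration.

    Dr is the camera sample duration in ms (1000 / camera_frame_rate);
    the last frame's duration is added to cover the whole final frame.
    """
    actual = (qr_n_first_cam_num - qr_a_first_cam_num) * Dr + last_frame_duration
    return abs(actual - expected_track_duration) <= tolerance
```

For a 50 fps camera (Dr = 20 ms), first/last frames first appearing on camera frames 10 and 510 with a 40 ms last frame give an actual duration of 500 * 20 + 40 = 10040 ms.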


NOTE Devices holding on to the last frame is a mirror image of the issue with the first frame. However, the Observation Framework cannot handle it in the same way as the first frame: the actual position in time of the finished() event cannot be calculated, because it is the last event. There is no following event from Test Runner, so the QR code generation delay for the finished() event cannot be obtained.


Test with waiting in playback

e.g: Buffer Underrun and Recovery. Test Runner signals "waiting" status when the waiting occurs. The playback waiting duration is calculated from the first detection of the "waiting" status (waiting_start_time) until it changes back to "playing" (playing_start_time). However, any "waiting" status before playback starts should be ignored. The total waiting duration should take the testing limitations into account: detection of the status QR code might be delayed by up to 1000/camera_frame_rate ms. expected_track_duration should then be adjusted by adding the detected waiting duration.

    min_gap_duration += playing_start_time - waiting_start_time - 1000/camera_frame_rate
    max_gap_duration += playing_start_time - waiting_start_time + 1000/camera_frame_rate
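The gap-bound adjustment above can be sketched as a small helper (an illustration only; the function name is hypothetical):

```python
def gap_duration_bounds(waiting_start_time, playing_start_time, camera_frame_rate):
    """Return (min, max) bounds on the detected waiting duration in ms.

    Status QR detection may lag by up to one camera frame duration
    (1000 / camera_frame_rate ms), so the detected duration is widened
    by that amount on each side.
    """
    detected = playing_start_time - waiting_start_time
    slack = 1000.0 / camera_frame_rate
    return detected - slack, detected + slack
```

With a 50 fps camera, a waiting period detected from 1000 ms to 2000 ms yields bounds of (980, 1020) ms.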

Truncated Playback and Restart

Duration checks for each presentation should be observed separately. A start-frame adjustment is required for the 1st presentation, while an ending-frame adjustment is required for the 2nd presentation. The playback duration of presentation one should be more than second_playout_switching_time.

The start-up delay should be sufficiently low, i.e., TR [k, 1] – Ti < TSMax

start_up_delay is calculated from the first-appearing camera frame number after play(). If the device displays frame 1 before play(), the start-up delay is measured to frame 2; when some frames are missing at the beginning, the start-up delay is measured to the first detected frame.

    start_up_delay = (QRa.first_appear_camera_frame_num_after_play * Dr) - ((play_event.camera_frame_num * Dr) - d)
    start_up_delay < TSMax
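A sketch of this calculation in code, assuming camera frame numbers and the Test Runner delay d are in consistent units (names are illustrative):

```python
TS_MAX = 120.0  # maximum permitted start-up delay per CTA-2003, in ms

def start_up_delay_ms(first_frame_cam_num, play_event_cam_num, Dr, d):
    """Start-up delay: time of the first detected frame minus the actual
    play() time. Dr is the camera sample duration in ms; d is the QR code
    creation delay on Test Runner, subtracted to recover when play() happened.
    """
    play_time = play_event_cam_num * Dr - d
    return first_frame_cam_num * Dr - play_time
```

For example, with Dr = 20 ms, a play event on camera frame 100 with d = 30 ms, and the first frame appearing on camera frame 105, the delay is 2100 - 1970 = 130 ms, which fails the TSMax check.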

The presented sample matches the one reported by the currentTime value within the tolerance of the sample duration

    sample_tolerance_in_recording = ct_frame_tolerance * (1000/mezzanine_frame_rate) / (1000/camera_frame_rate) = ct_frame_tolerance * camera_frame_rate/mezzanine_frame_rate
    sample_tolerance = ct_frame_tolerance * 1000/mezzanine_frame_rate
    
    target_camera_frame_num_of_ct_event = ct_event.first_seen_camera_frame_num - (ct_event.d / Dr)
    first_possible_camera_frame_num_of_target = target_camera_frame_num_of_ct_event - CAMERA_FRAME_ADJUSTMENT - sample_tolerance_in_recording
    last_possible_camera_frame_num_of_target = target_camera_frame_num_of_ct_event + CAMERA_FRAME_ADJUSTMENT  + sample_tolerance_in_recording
    
    for first_possible_camera_frame_num_of_target to last_possible_camera_frame_num_of_target
            foreach mezzanine_qr_code on a camera_frame that is within the range
                    if mezzanine_qr_code.media_time == (ct_event.current_time +/- (sample_tolerance + tolerance))
                            test is PASSED
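The window search above can be sketched as below. This is a simplified illustration: it assumes the detected QR media times (in ms) have been grouped by camera frame number into a dict, and all parameter names are as defined earlier in this page.

```python
def current_time_matches(ct_cam_frame_num, ct_delay_d, ct_current_time,
                         media_times_by_cam_frame, Dr,
                         camera_frame_rate, mezzanine_frame_rate,
                         ct_frame_tolerance=1, tolerance=20.0,
                         camera_frame_adjustment=0.5):
    # Tolerance expressed in recording (camera) frames and in milliseconds.
    sample_tol_rec = ct_frame_tolerance * camera_frame_rate / mezzanine_frame_rate
    sample_tol_ms = ct_frame_tolerance * 1000.0 / mezzanine_frame_rate
    # Shift the currentTime event back by its QR generation delay.
    target = ct_cam_frame_num - ct_delay_d / Dr
    first = target - camera_frame_adjustment - sample_tol_rec
    last = target + camera_frame_adjustment + sample_tol_rec
    # Pass if any mezzanine media time in the window matches currentTime.
    for cam_frame, media_times in media_times_by_cam_frame.items():
        if first <= cam_frame <= last:
            for media_time in media_times:
                if abs(media_time - ct_current_time) <= sample_tol_ms + tolerance:
                    return True
    return False
```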

For the splicing test, the actual media time is calculated by adding the previous period duration:

    previous_period is calculated based on the playout parameter from the Test Runner.
    mezzanine_qr_code.media_time[i] = mezzanine_qr_code.media_time[i] + previous_period

For the Low-Latency (1): Initialization test

The current time checks for "playing", "play", and "current time = 0.0" are ignored. This is because "playing" will change to "waiting", and the 1st frame won't be rendered until the first CMAF fragment has been successfully appended.

Measure the time between the successful appending of the first CMAF chunk that exceeded min_buffer_duration and the first media sample being visible or audible. This value shall be compared against render_threshold.

This observation is similar to the start-up delay. However, it is measured from the appending of the first CMAF chunk; Test Runner signals an "appended" event on the successful appending of the first CMAF chunk.

Audio Observation Algorithms

This section defines observation algorithms for each audio observation.

The audio mezzanine is cut into small segments of 20ms. Cross-correlation is used to compare the mezzanine with the recording and obtain offset timings for each segment from the recording. To speed up the calculation, only the expected neighbourhood of each segment is checked: a 500ms sample is taken from the recording file at the expected position, instead of searching for matches in the whole recording file.
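A minimal sketch of locating one mezzanine segment inside a short recording window by cross-correlation, using only NumPy. The sample rate and signal shapes here are invented for illustration; the framework's actual matching is more involved.

```python
import numpy as np

def segment_offset(recording_window, segment):
    """Return the sample offset within recording_window at which the
    cross-correlation with the mezzanine segment peaks."""
    # "valid" mode slides the segment over every fully-overlapping position.
    corr = np.correlate(recording_window, segment, mode="valid")
    return int(np.argmax(corr))
```

For example, embedding a 160-sample segment (20 ms at an assumed 8 kHz) at a known position inside a noise window and running segment_offset recovers that position.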

  • Detected audio segments timings in recording: [ASa, ASb, ASc .... ASn] where ASa=0ms
  • Audio media time: [ASa.media_time = 0ms, ASb.media_time = 20ms, ASc.media_time = 40ms .... AS[i].media_time = 20ms * i]
  • Audio segment length: audio_sample_length = 20ms
  • Maximum permitted startup delay (CTA-2003: 120ms): TSMax
  • A FIXED arbitrary value equivalent to calculate a half of the recording frame: CAMERA_FRAME_ADJUSTMENT = 0.5
  • tolerance = 20ms
  • Audio-Video Synchronization tolerance: av_sync_tolerance = +40ms / -120ms

Every sample shall be rendered and the samples shall be rendered in increasing presentation time order

For each audio segment, compare its detected timing in the recording with its media time; if they do not match, report a failure.

    for ASa to ASn:
        abs(AS[i] - AS[i].media_time) <= audio_sample_length
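This check can be sketched as follows, assuming the detected segment timings (in ms) are in a list indexed by segment number (the function name is illustrative):

```python
def failing_audio_segments(detected_timings, audio_sample_length=20.0):
    """Return the indices of segments whose detected timing deviates from
    their expected media time (i * audio_sample_length) by more than one
    audio sample length."""
    return [i for i, detected in enumerate(detected_timings)
            if abs(detected - i * audio_sample_length) > audio_sample_length]
```

For example, timings [0, 20, 41, 100] fail only at index 3 (expected 60 ms, detected 100 ms), since index 2's 1 ms deviation is within the 20 ms segment length.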

Random Access Tests

The expected mezzanine audio signal is calculated from the random access point until the end.

Splicing tests

The expected mezzanine audio signal is calculated from the playout parameter.

The playback duration of the playback matches expected duration

Calculate the detected audio duration. Measurement starts from the beginning of the first detected audio segment to the finishing time of the last detected audio segment. The duration check should account for duration missing at either end of playback.

    expected_track_duration = cmaf_track_duration - start_missing_duration - ending_missing_duration
    actual_playback_duration = ASn - ASa + audio_sample_length
    actual_playback_duration == expected_track_duration +/- tolerance

Random Access Tests

The expected duration is calculated from the random access point until the end.

Splicing tests

The expected duration is calculated from the playout parameter.

The start-up delay should be sufficiently low, i.e., TR [k, 1] – Ti < TSMax

start_up_delay is calculated from the first detected audio segment after play().

    start_up_delay = ASa - ((play_event.camera_frame_num * Dr) - d)
    start_up_delay < TSMax

NOTE The detected audio segment timings in the recording are checked in "Every sample shall be rendered, and the samples shall be rendered in increasing presentation time order" and any failing segments are removed. When some audio segments are missing at the beginning, "ASa" is the earliest detected audio segment.


The presented sample matches the one reported by the currentTime value within the tolerance of the sample duration

    sample_tolerance = audio_sample_tolerance * audio_sample_length

    target_camera_frame_num_of_ct_event = ct_event.first_seen_camera_frame_num - (ct_event.d / Dr)
    first_possible_camera_frame_num_of_target = target_camera_frame_num_of_ct_event - CAMERA_FRAME_ADJUSTMENT - sample_tolerance
    last_possible_camera_frame_num_of_target = target_camera_frame_num_of_ct_event + CAMERA_FRAME_ADJUSTMENT  + sample_tolerance

    for first_possible_camera_frame_num_of_target to last_possible_camera_frame_num_of_target
            foreach audio_segment that is within the range
                    if audio_segment.media_time == (ct_event.current_time +/- (sample_tolerance + tolerance))
                            test is PASSED

Combined Audio and Video Observation Algorithms

This section defines observation algorithms for combined audio and video observation.

Every sample for every media type included in the CMAF Presentation duration shall be rendered and shall be rendered in order

Video and audio observations are separately made, and results are given separately.

The playback duration of the playback matches expected duration

Video and audio observations are separately made, and results are given separately.

The presentation starts with the earliest video sample and the audio sample that corresponds to the same presentation time

This observation is the same as "The presented sample matches the one reported by the currentTime value within the tolerance of the sample duration", but the check is done for the 1st frame.

Audio-Video Synchronization: The mediaTime of the presented audio sample matches the one reported by the video currentTime value within the tolerance of +40 ms / -120 ms

  1. Calculate the video offset. Calculate the mean detection time based on the 1st and last QR code detection times. The 1st frame might be rendered before play, so take its last detection time; the last frame might be rendered after playback stops, so take its first detection time. Calculate the offset between detection time and media time for each detected frame. Clean up the offsets to smooth out the video detection limitations.

  2. Calculate the audio offset. Calculate the offsets between detection time and media time for each audio segment.

  3. Check that A/V sync is within the tolerance. Find the matching audio and video samples based on the same media time; the difference should be less than one audio sample length, otherwise no match is found. Ignore the audio samples that failed to find a matching video sample, where the sync is not measurable.
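The three steps above can be sketched as one matching function. This is a simplified illustration, assuming the per-frame and per-segment offsets (detection time minus media time, in ms) have already been computed into dicts keyed by media time:

```python
def av_sync_differences(video_offsets, audio_offsets, audio_sample_length=20.0,
                        av_sync_tolerance=(40.0, -120.0)):
    """Return {audio_media_time: (audio_offset - video_offset, in_sync)}.

    For each audio segment, the video sample with the closest media time is
    used; audio samples with no video match within one audio sample length
    are skipped, since sync is not measurable for them.
    """
    hi, lo = av_sync_tolerance
    results = {}
    for a_time, a_off in audio_offsets.items():
        v_time = min(video_offsets, key=lambda t: abs(t - a_time), default=None)
        if v_time is None or abs(v_time - a_time) > audio_sample_length:
            continue  # no matching video sample; sync not measurable
        diff = a_off - video_offsets[v_time]
        results[a_time] = (diff, lo <= diff <= hi)
    return results
```

For example, a video offset of 5 ms at media time 100 and an audio offset of 25 ms at media time 110 match (10 ms apart, within one segment length) and give a +20 ms difference, inside the +40/-120 ms tolerance.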