feat: return avg_logprobs from faster_whisper #413

Open: werkamsus wants to merge 1 commit into main

werkamsus

faster-whisper can output avg_logprob for each segment; this patch passes those values through to whisperx.transcribe.
I kept the changes minimal and tested everything.

Note: I couldn't figure out how to pass avg_logprob through alignment / diarization, so I built a workaround for myself. You can call it after alignment / diarization, provided you saved both the initial result and the final result:

def match_segments(first_dict, second_dict):
    """Copy avg_logprob from each segment of first_dict onto the segment of
    second_dict whose start/end times are closest to it."""
    aligned_segments = []

    for first_segment in first_dict['segments']:
        aligned_segment = None
        closest_difference = float('inf')

        for second_segment in second_dict['segments']:
            # Compute the difference in start and end times
            start_difference = abs(first_segment['start'] - second_segment['start'])
            end_difference = abs(first_segment['end'] - second_segment['end'])
            total_difference = start_difference + end_difference

            # Keep a copy of the closest match so second_dict is not mutated
            if total_difference < closest_difference:
                closest_difference = total_difference
                aligned_segment = second_segment.copy()

        # Add avg_logprob to the closest matching segment (None if second_dict has no segments)
        if aligned_segment is not None:
            aligned_segment['avg_logprob'] = first_segment['avg_logprob']
            aligned_segments.append(aligned_segment)

    return {'segments': aligned_segments}


matched_segments = match_segments(result, result_align)
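
To illustrate what the workaround does, here is a minimal, self-contained sketch of the same nearest-boundary matching run on toy data. It assumes both results are dicts with a 'segments' list whose entries have 'start' and 'end' keys, and that only the first result carries 'avg_logprob'. The helper name attach_logprobs and all sample values are hypothetical, not part of whisperX or faster-whisper.

```python
def attach_logprobs(first, second):
    """Copy avg_logprob from each segment in `first` onto a copy of the
    segment in `second` whose start/end times are closest to it."""
    out = []
    if not second["segments"]:
        return {"segments": out}
    for seg in first["segments"]:
        # Distance = |start difference| + |end difference|, as in the workaround above
        closest = min(
            second["segments"],
            key=lambda s: abs(seg["start"] - s["start"]) + abs(seg["end"] - s["end"]),
        )
        # Build a new dict so the aligned result is left untouched
        out.append({**closest, "avg_logprob": seg["avg_logprob"]})
    return {"segments": out}


# Toy stand-ins for the initial result and the aligned result (made-up values)
result = {"segments": [
    {"start": 0.0, "end": 2.1, "text": "hello there", "avg_logprob": -0.21},
    {"start": 2.1, "end": 4.0, "text": "general kenobi", "avg_logprob": -0.35},
]}
result_align = {"segments": [
    {"start": 0.05, "end": 2.0, "text": "hello there"},
    {"start": 2.2, "end": 3.9, "text": "general kenobi"},
]}

matched = attach_logprobs(result, result_align)
for seg in matched["segments"]:
    print(seg["start"], seg["end"], seg["avg_logprob"])
```

Note that nothing here enforces a one-to-one mapping: if alignment splits or merges segments, two initial segments can land on the same aligned segment, so the output is best treated as approximate.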

werkamsus mentioned this pull request on Aug 8, 2023

pprobst commented Sep 22, 2023

This is very relevant!

@Tejaswgupta

@m-bain, will you be merging this one in?

@lfcnassif

+1

Thanks for this great project! Can this PR be used in production? We have a hard requirement to return the average transcription confidence score to our users.


lfcnassif commented Apr 28, 2024

For those interested, I applied this commit on top of the latest main commit in this fork and branch:
https://github.com/sepinf-inc/whisperX/tree/confidence_score
