[WIP] Set up a Rust native version of Hardsubx context and rewrite hardsubx.c #1458
base: master
shashwat1002
commented
Oct 31, 2022
- constructor function (new) made that takes options as argument [this will replace init_hardsubx]
- default function for hardsubx context
- default function for cc_subtitle
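The constructor-plus-`Default` pattern listed above could look roughly like the sketch below. The struct and field names here (`HardsubxOptions`, `HardsubxContext`, `input_file`, `min_sub_duration`, `detect_italics`, `frame_count`) are simplified, hypothetical stand-ins and not the actual definitions in this PR.

```rust
/// Hypothetical, simplified options struct (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct HardsubxOptions {
    pub input_file: String,
    pub min_sub_duration: f32,
    pub detect_italics: bool,
}

/// Hypothetical, simplified context struct (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct HardsubxContext {
    pub input_file: String,
    pub min_sub_duration: f32,
    pub detect_italics: bool,
    pub frame_count: u64,
}

impl Default for HardsubxContext {
    /// Baseline values used when no options are supplied.
    fn default() -> Self {
        Self {
            input_file: String::new(),
            min_sub_duration: 0.5,
            detect_italics: false,
            frame_count: 0,
        }
    }
}

impl HardsubxContext {
    /// Builds a context from user options; fields that are not derived
    /// from the options fall back to the `Default` implementation.
    pub fn new(options: &HardsubxOptions) -> Self {
        Self {
            input_file: options.input_file.clone(),
            min_sub_duration: options.min_sub_duration,
            detect_italics: options.detect_italics,
            ..Default::default()
        }
    }
}
```

Keeping `Default` separate from `new` means state that is not option-driven is initialized in one place only.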
@PunitLodha please check if the new struct looks okay.
…xt and define AVPacket
328e45a
to
d32533e
Compare
de8b351
to
ceeff56
Compare
@prateekmedia Since you made the change to |
@PunitLodha you can check what the CI says for build_ocr_hardsubx on Linux. If it runs fine, no change is needed; otherwise I think some imports may be missing. For example, ffmpeg_sys_next::* is not equivalent to rsmpeg::*, so some dependencies need to be included manually. See my implementation in mod.rs and utility.rs.
@prateekmedia @IshanGrover2004 @PunitLodha thoughts on this PR?
Code looks fine, but it needs testing before it can be merged. @prateekmedia @cfsmp3
@shashwat1002 if this still applies, can you rebase?
Pull Request Overview
This PR sets up a native Rust version of the HardsubX context and rewrites portions of the hardsubx C implementation. The changes include introducing a new constructor (new) that takes options as an argument, adding default implementations for the hardsubx context and cc_subtitle, and refactoring OCR-related functions to use native Rust types.
Reviewed Changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/rust/src/lib.rs | Introduces a new enum for encoding types required for native HardsubX processing |
| src/rust/src/hardsubx/utility.rs | Adds an edit_distance_string function using Rust idioms |
| src/rust/src/hardsubx/mod.rs | Adds context-related enums, struct initializations, and a new constructor for context |
| src/rust/src/hardsubx/main.rs | Integrates the new context and processing flows for burned-in subtitle extraction |
| src/rust/src/hardsubx/classifier.rs | Refactors OCR functions to return Rust Strings rather than raw C strings |
| src/rust/build.rs | Updates the allowlist and symbol list for FFI bindings |
| src/rust/Cargo.toml | Adds new dependencies to support numeric types and traits |
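For context on the `utility.rs` row, a string edit distance in idiomatic Rust might be sketched as below. This is a generic Levenshtein implementation with a hypothetical name (`edit_distance`); the signature and behavior of the PR's own `edit_distance_string` may differ.

```rust
/// Levenshtein distance between two strings, counted in Unicode scalar values.
/// Illustrative sketch only; not the function added in this PR.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] holds the distance between the processed prefix of `a`
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![0usize; b.len() + 1];
        curr[0] = i + 1; // deleting i + 1 characters of `a`
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + if ca == cb { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[b.len()]
}
```

Working on `char`s rather than bytes keeps multi-byte UTF-8 characters from being counted as several edits.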
Files not reviewed (2)
- src/lib_ccx/hardsubx.c: Language not supported
- src/rust/wrapper.h: Language not supported
Comments suppressed due to low confidence (2)
src/rust/src/lib.rs:31
- [nitpick] The enum name 'ccx_encoding_type' does not follow Rust naming conventions. Consider renaming it to 'CcxEncodingType'.
pub enum ccx_encoding_type {
src/rust/src/hardsubx/mod.rs:24
- The import 'use std::matches;' is unnecessary or incorrect since the 'matches!' macro is available in the prelude. Consider removing this import.
use std::matches;
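Both suppressed comments can be illustrated with one small sketch: an enum named in `CamelCase`, and `matches!` used without any `use` line, since the macro is in scope in every crate that links `std`. The variant names and the helper `is_single_byte` are illustrative assumptions, not the definitions in `lib.rs`.

```rust
/// Encoding enum using Rust naming conventions.
/// Variant names are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CcxEncodingType {
    Unicode,
    Latin1,
    Utf8,
    Ascii,
}

/// Hypothetical helper: `matches!` works here with no `use std::matches;`.
pub fn is_single_byte(enc: CcxEncodingType) -> bool {
    matches!(enc, CcxEncodingType::Latin1 | CcxEncodingType::Ascii)
}
```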
} else {
text_out = TessBaseAPIGetUTF8Text((*ctx).tess_handle);
text_out = TessBaseAPIGetUTF8Text(ctx.tess_handle);
Memory allocated by TessBaseAPIGetUTF8Text is not freed, which may lead to a memory leak. Ensure that TessDeleteText is called on the returned pointer after converting it to a Rust String.
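One way to address this is a small helper that copies the C string into an owned `String` and then releases the original buffer. The sketch below is self-contained and does not link against Tesseract; `take_c_text` is a hypothetical name, and the deallocator is passed in as a closure so that real code could supply a call to `TessDeleteText` (assuming the bindings expose it with a compatible pointer type).

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Copies a NUL-terminated C string into an owned `String`, then hands the
/// original pointer to `free_fn` so the C-side allocation is released.
/// Returns `None` for a null pointer, in which case `free_fn` is not called.
///
/// # Safety
/// `ptr` must be null or point to a valid NUL-terminated string that
/// `free_fn` is allowed to free. `ptr` must not be used afterwards.
pub unsafe fn take_c_text<F>(ptr: *mut c_char, free_fn: F) -> Option<String>
where
    F: FnOnce(*mut c_char),
{
    if ptr.is_null() {
        return None;
    }
    // Copy first: the bytes must be owned by Rust before the buffer is freed.
    let owned = unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned();
    free_fn(ptr);
    Some(owned)
}
```

At the call site in question, the pointer returned by `TessBaseAPIGetUTF8Text` would be passed as `ptr`, with a closure that calls `TessDeleteText` on it.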