-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added --limit=<seconds> flag to csv export. (Includes tests.) #15
Conversation
…ll be processed, then substudy will stop early. Addresses feature request emk#14. Added tests to confirm that the generated csv contains expected data. Bumped cli_test_dir to version 0.1.2 to take advantage of new expect_file_contents and expect_contains methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great PR! Thank you very much for sending it.
I have just a couple of ideas for improvement, because I'm picky and unreasonable. 😉 But I think this would be a great addition.
And thank you so much for including tests. Yay, tests!
src/bin/substudy.rs
Outdated
@@ -21,7 +21,7 @@ Subtitle processing tools for students of foreign languages | |||
|
|||
Usage: substudy clean <subs> | |||
substudy combine <foreign-subs> <native-subs> | |||
substudy export csv <video> <foreign-subs> [<native-subs>] | |||
substudy export csv [--limit=<secs>] <video> <foreign-subs> [<native-subs>] | |||
substudy export review <video> <foreign-subs> [<native-subs>] | |||
substudy export tracks <video> <foreign-subs> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two thoughts:
- Can we find a better name than
--limit
? - If we're going to support this for
export csv
, can we support it for all theexport
subcommands? I think they mostly use the same code internally, don't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't particularly like the name "limit" either, but I couldn't think of a better one at the time. "End-at" is much better, thanks! I've changed it to "end-at" throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to add it to the other exports before I knew if you'd even accept this PR, but I agree with you, it's much better to be consistent and not surprise/confuse users. So I've added it to the other 2 export
subcommands.
src/export/exporter.rs
Outdated
@@ -129,6 +134,11 @@ impl Exporter { | |||
self.native.as_ref() | |||
} | |||
|
|||
/// Get the seconds to limit processing to. | |||
pub fn processing_limit(&self) -> Option<f32> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to improve this name, too. processing_limit
doesn't specify either what kind of processing we're doing, or the units we use to measure the limit. Maybe end_time
? And then we could change --limit
to --end-time
, perhaps? I'm open to other suggestions, of course!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or even --end-at
and end_at
if that looks better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to end[-_]at
! :)
tests/tests.rs
Outdated
@@ -54,6 +54,10 @@ fn cmd_combine() { | |||
assert!(from_utf8(&output.stdout).unwrap().find("¡Si!").is_some()); | |||
} | |||
|
|||
static FIRST_LINE: &'static str = "[sound:empty_00060_828-00066_164.es.mp3],0:01:00.828,empty,\"<img src=\"\"empty_00063_496.jpg\"\" />\",¡Si! ¡Aang ha vuelto!,Yay! Yay! Aang\'s back!,,,¡Lo sabía!,I knew it!\n"; | |||
|
|||
static LAST_LINE: &'static str = "[sound:empty_00077_124-00083_379.es.mp3],0:01:17.124,empty,\"<img src=\"\"empty_00080_251.jpg\"\" />\",\"El no ha hecho nada, Sokka, fué un accidente.\",Aang didn\'t do anything.,harás que vengan a nosotros.,\"You\'re leading them straight to us, aren\'t you?\",,\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FIRST_LINE
and LAST_LINE
test data is too specific, and it will cause the tests to break whenever we change the details of the sound naming, the exact time padding, or the HTML we output for Anki. It might be better to limit it to:
static FIRST_LINE: &'static str = "¡Si! ¡Aang ha vuelto!,Yay! Yay! Aang\'s back!,,,¡Lo sabía!,I knew it!";
...and something similar for LAST_LINE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some test somewhere that confirms the exact format of the csv export, so we can be sure we don't break Anki support. But yeah, let's do that in a different PR in a better way than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that changing to your suggestion without updating cli_test_dir
will mean that we lose the test that verifies --end-at
actually had the desired effect.
tests/tests.rs
Outdated
testdir.expect_path("empty_csv/cards.csv"); | ||
testdir.expect_path("empty_csv/empty_00063_496.jpg"); | ||
testdir.expect_path("empty_csv/empty_00060_828-00066_164.es.mp3"); | ||
testdir.expect_file_contents("empty_csv/cards.csv", FIRST_LINE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the corresponding negative tests here, both expect_not_path
and expect_not_file_contents
, or something like that. Please feel free to submit a PR against cli_test_dir
with the APIs you need!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you feel about accepting this PR's tests as-is, then when I've added expect_not_path and that's been made available, I can submit a follow-up PR here using the new APIs?
I prefer multiple PRs that keep things moving forward, rather than a series of dependent PRs... but also I recognize that sometimes "later" never happens. They're both your projects, so I'll defer to whichever you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'd prefer to hold off on merging this PR until the tests are complete, because then we can both trivially verify that it works correctly without needing to hand-test it. And then I can be lazy and only make a single release of each project, too. 😉
The rest of this patch is looking really great! I'm impressed that your first Rust contributions are so clean and well-tested!
No, thank you for having tests already in the project, so it was easy to TDD this! I've never even looked at Rust before, so I was glad to have an "easy" entry point. :) I agree with all your suggestions, but my free time is unpredictable, so I'm not sure when I'll get back to this. Thanks for replying quickly, though, even if I can't follow up as quickly! |
For the record, I don't think you're being "nitpicky" at all! Keeping codebases clean, consistent, and well-factored is important. And since I'd never even looked at Rust or your project before, there are bound to be things I need to fix up. :) |
} | ||
_ => {} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a clean way in Rust to extract this 3x duplicated code block? The trick is that it uses break
it stop iterating early, which a side-effect-y helper function usually can't do. But if I could get rid of the duplication, that would make me happier and the code easier to maintain. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an if let
statement might allow you to drop the _ => {}
at least?
fn main() {
let v1 = Some("world");
if let Some(v2) = v1 {
if v2 == "world" {
println!("Hello, world!");
}
}
}
Another option (probably a better one) is to move the duplicated code into a method on exporter
, so you can just:
if exporter.finishedBefore(seconds) {
break;
}
Or even more ambitiously:
if !exporter.shouldExportAt(seconds) {
next;
}
This version would also work for a --start-at
option, which is inevitably going to be added once --end-at
exists. :-)
Also, as a general rule for substudy
, it's often worth replacing println!
(when useful) with debug!
. This will hide the extra output unless the user sets RUST_LOG=debug
in the environment, but I generally prefer commands that don't talk much as long as they're doing what you asked them to do. ;-) But I do like generous use of debug!
and trace!
; it's proven extraordinarily useful in production. The log
+ env_logger
crates are pretty awesome for something so simple.
I've updated |
Just as a followup, I think this is almost ready to merge. IIRC, it just needs some tests to prove that it honors I've added the necessary APIs to |
Hello, I'm putting some more work in on |
This could still be merged if somebody were interested in cleaning up the merge conflicts, but the affected code will be heavily rewritten soon. |
Addresses (my own) feature request #14.
Added
--limit=<seconds>
flag to csv export. Only the first seconds will be processed, then substudy will stop early. Useful when testing a new file or subtitle timing.Added tests to confirm that the generated csv contains expected data.
Bumped cli_test_dir to version 0.1.2 to take advantage of new
expect_file_contents
andexpect_contains
methods.