Extraction error #8

Open
Devyatyi9 opened this issue Aug 15, 2022 · 5 comments

Comments


Devyatyi9 commented Aug 15, 2022

I got some errors when I tried to extract a USM file, and even when using the 'probeusm' command. The file was created with Scaleform VideoEncoder (CRI Medianoche [NR] Ver.1.70.00 - 2013-03-08) from a sample video and subtitles, and the video works correctly in game (although I had to change the value of the language ID in the subtitle chunk so that the subtitles are displayed).
I didn't find any mention of the @SBT tag (signature) in the log file.
test_1_usm.usm_NSi.log
test_1_usm.zip
screenshot 1
screenshot 2


Devyatyi9 commented Aug 15, 2022

I also checked a file from Honkai Impact 3rd and got the same error.
1.7_CG01_mux.usm_KYe.log.zip
Maybe this will help with information - description


Youjose commented Aug 17, 2022

WannaCri is missing some tags related to USM chunks, and overall the extraction method could be improved, since the CRID part of the header basically lists all the chunk types included in the USM.

I checked your profile and, as you know, the SBT format is not particularly strict: the language IDs can be arbitrary and won't always match the Honkai labeling. I would assume converting back to SBT format after extracting to SRT is quite easy, basically datetime -> milliseconds.
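As a rough illustration of the datetime -> milliseconds step, here is a minimal Python sketch. It assumes standard SRT timestamps of the form `HH:MM:SS,mmm`; the names `srt_time_to_ms` and `parse_srt_time_line` are made up for this example and are not part of WannaCri or any other tool mentioned in this thread.

```python
import re

# Assumption: standard SRT timestamps, e.g. "00:01:02,500".
# A "." separator is also accepted because some files use it.
_SRT_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def srt_time_to_ms(stamp: str) -> int:
    """Convert one SRT timestamp to an integer number of milliseconds."""
    match = _SRT_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt_time_line(line: str) -> tuple[int, int]:
    """Split an SRT timing line "start --> end" into (start_ms, end_ms)."""
    start, end = line.split("-->")
    return srt_time_to_ms(start), srt_time_to_ms(end)
```

How the resulting millisecond values are laid out inside an @SBT chunk is not covered here; that would have to come from the SDK or an existing sample file.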

If you want to extract those, you can use my tool, which extracts the SBT tag and converts it to SRT for all available languages. I couldn't find a way to determine what the language ID specifically is, though, as it's not consistent and the value is most likely arbitrary. It's also not as easy to use as WannaCri; you would need some Python experience.

If they still want to hardcode the tag types, though, an improvement would be to add all of these:

from enum import Enum


class USMChunckHeaderType(Enum):
    CRID  = b"CRID" # Header.
    SFSH  = b"SFSH" # SofDec1 Header?
    SFV   = b"@SFV" # Video (VP9/H264/MPEG).
    SFA   = b"@SFA" # Audio (HCA/ADX).
    ALP   = b"@ALP" # Rare. (Alpha video information.)
    CUE   = b"@CUE" # Rare. (Unknown.)
    SBT   = b"@SBT" # Rare. (Subtitle information.)
    AHX   = b"@AHX" # Rare. (Ahx audio file? Used for SofDec1 only?)
    USR   = b"@USR" # Rare. (User data?)
    PST   = b"@PST" # Rare. (Unknown.)

As I said, though, it's probably better to reserve this for building; for extraction, the CRID part says it all.
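To see which chunk types a given file actually contains without hardcoding them, a simple chunk walk is enough. The sketch below is not how WannaCri works and does not parse the @UTF table inside the CRID part; it is a simpler alternative that only tallies signatures. It assumes each chunk starts with a 4-byte signature followed by a 4-byte big-endian size that counts the bytes after those first 8, which should be checked against a real file. The function name `count_chunk_signatures` is hypothetical.

```python
import struct
from collections import Counter


def count_chunk_signatures(data: bytes) -> Counter:
    """Tally chunk signatures in a USM-like byte buffer.

    Assumed layout (not verified against the SDK): 4-byte signature,
    then a 4-byte big-endian size of the remainder of the chunk.
    """
    counts: Counter = Counter()
    offset = 0
    while offset + 8 <= len(data):
        signature, size = struct.unpack_from(">4sI", data, offset)
        counts[signature] += 1
        # Skip the 8 header bytes plus the declared chunk body.
        offset += 8 + size
    return counts
```

Running this on a file and printing the result would show whether signatures such as `b"@SBT"` appear at all, which is a quick check before deciding which chunk types a parser needs to know about.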

Also, for @UTF parsing, I couldn't help but notice that the indexes might be wrong in WannaCri. There's conflicting usage of those across tools, but I did find that using the even index for unsigned type values is correct, as otherwise it would give incorrect values in some CPKs/ACBs. I'm not sure myself, though.


Devyatyi9 commented Aug 18, 2022

@LittleChungi Thanks!

what the language ID specifically is

It's the subtitle track.
Earlier, before I found out about this program, I decided to call it the lang ID.
2022-08-18_08-26-03


Youjose commented Aug 19, 2022

Yeah, I saw this in the SDK. The numbers are arbitrary, sadly, so what is specified for a certain game is not universal. I'm glad I saw an example of this chunk in use here, though; it helped me write the SBT -> SRT converter for my lib.

donmai-me (Owner) commented

Thanks for taking the time to report this, and thank you to LittleChungi for the helpful information. I don't think I can currently add support for subtitles and all the other types of USM chunks, mostly because I lack the time and information needed. However, if you're only interested in extracting the video and audio data, I can make the other USM chunks act like stubs in the meantime.
