add support new backup format #14

Open
khimaros opened this issue Feb 22, 2022 · 24 comments · May be fixed by #16

Comments

@khimaros
Contributor

with the update to android 12, seedvault has also bumped the metadata version.

it would be great to be able to use this tool with newer backups.

@khimaros
Contributor Author

@chirayudesai says:

There's a doc in the storage section which this format shares some things with
otherwise it should all be seedvault-app/seedvault#327

@grote says:

mostly this with a different key id: https://github.com/seedvault-app/seedvault/blob/android12/storage/doc/design.md
I think there's a python version of tink to make your life easier.
The k/v backup is a zipped DB now, I think there's no docs on this.

here's the tink library: https://pypi.org/project/tink/

@khimaros khimaros changed the title add support for metadata version 1 add support new backup format Feb 22, 2022
@khimaros
Contributor Author

khimaros commented Mar 4, 2022

seedvault-app/seedvault#383 will be helpful for testing this change

@khimaros
Contributor Author

khimaros commented Jan 30, 2023

i have access to a reference backup now and am working on a python-tink based decryption tool.

traveling for the next few weeks, so not a ton of time to allocate, but making slow steady progress.

@ladar

ladar commented Feb 5, 2023

@khimaros please add me to the list of people waiting for an updated tool. I accidentally erased several people from my contacts. I'm hoping I can extract just those contacts from the backup once it is decrypted.

The only other alternative I can think of would involve backing up my device, restoring the Seedvault backup I've preserved, extracting the missing contact records, and then restoring the backup I made at the beginning. But this doesn't seem like a great option given a) the time involved, and b) the likelihood that something will go wrong.

@Adri-Fa

Adri-Fa commented May 16, 2023

Did you ever manage to do this? If not, do you have anything that could help me?

@nettnikl

working on a python-tink based decryption tool

Hey @khimaros, could it be an option to share your current progress? I think there are many people interested in this, maybe we could help with testing.

@jackwilsdon

I've just finished writing a tool to extract v1 backups - it's available here: https://github.com/jackwilsdon/seedvault-extractor

Feel free to open an issue or start a discussion if you need any help!

@khimaros
Contributor Author

@jackwilsdon this is excellent! thank you for your efforts! i'll test this out.

what are the main challenges for supporting KV extraction?

@jackwilsdon

jackwilsdon commented May 17, 2023

I think implementing KV support should be pretty straightforward - it appears that it might just be an encrypted gzip'd SQLite database looking at KVRestore.kt. I'll see if I can find some time over the next few days to add it.

@jackwilsdon

jackwilsdon commented May 17, 2023

It ended up being simple enough to dump the SQLite database to disk, so I've gone ahead and implemented it in jackwilsdon/seedvault-extractor@e2875f7. Ideally it'd be exported in a more user-friendly format (JSON?), but this is at least a start.
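
For readers following along, here is a minimal sketch of the "gunzip, then write the SQLite file to disk" step described above. It is not the code from seedvault-extractor: it assumes the blob has already been decrypted, and the `kv_entry` schema built by `make_sample_blob` (columns `key` and `value`) is a hypothetical stand-in used only to make the example runnable.

```python
import gzip
import sqlite3
import tempfile
from pathlib import Path


def dump_gzipped_sqlite(decrypted_blob: bytes, out_path: Path) -> list[str]:
    """Gunzip an already-decrypted blob, write it to disk, and list its tables."""
    out_path.write_bytes(gzip.decompress(decrypted_blob))
    conn = sqlite3.connect(out_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def make_sample_blob() -> bytes:
    """Build a stand-in for a decrypted K/V backup (hypothetical schema)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "sample.db"
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE kv_entry (key TEXT PRIMARY KEY, value BLOB)")
        conn.execute("INSERT INTO kv_entry VALUES (?, ?)", ("greeting", b"hello"))
        conn.commit()
        conn.close()
        return gzip.compress(db_path.read_bytes())
```

Calling `dump_gzipped_sqlite(make_sample_blob(), Path("kv.sqlite"))` leaves a regular SQLite file on disk that any SQLite client can open.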

@khimaros
Contributor Author

khimaros commented May 17, 2023

@jackwilsdon dumping the sqlite database seems adequate to me! it's easy enough to use other tools to export sqlite into a csv, json, or other format.

i tested that changelist against a reference backup provided by the seedvault team and it seemed to work! at least, i was able to select some rows from kv_entry and some of them were human readable.
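
As a rough illustration of exporting the dumped database into another format, the sketch below converts every row of a table to JSON using only the standard library. The table name `kv_entry` comes from this thread; the column names in the usage note are assumptions, so the function reads them from the cursor instead of hard-coding them. Binary values are wrapped as base64 because JSON cannot hold raw bytes.

```python
import base64
import json
import sqlite3


def table_to_json(conn: sqlite3.Connection, table: str) -> str:
    """Serialize every row of `table` to a JSON array of objects."""
    # Table names cannot be bound as SQL parameters, so check the name
    # against sqlite_master before interpolating it into the query.
    known = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    records = []
    for row in cursor:
        record = {}
        for column, value in zip(columns, row):
            if isinstance(value, bytes):
                # Raw bytes are not valid JSON, so wrap them as base64.
                value = {"base64": base64.b64encode(value).decode("ascii")}
            record[column] = value
        records.append(record)
    return json.dumps(records, indent=2)
```

Usage would look like `print(table_to_json(sqlite3.connect("kv.sqlite"), "kv_entry"))`, where `kv.sqlite` is whatever path the database was dumped to.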

the reference backup i'm referring to was originally uploaded to a git repo with restricted access. @chirayudesai -- is it okay if i upload a tarball and link it here? i think it would be generally useful to others.

@chirayudesai

Thanks @jackwilsdon , glad to see this!

the reference backup i'm referring to was originally uploaded to a git repo with restricted access. @chirayudesai -- is it okay if i upload a tarball and link it here? i think it would be generally useful to others.

I'd prefer it not be, it likely doesn't have any PII but I'd rather be safe than sorry.

What we can do is just create a backup from an emulator and that should be ok to share.

The git repo idea also didn't work because I tried to put some data on it and that quickly exceeded GitHub LFS limits, but maybe we can use releases.

@jackwilsdon

I'm setting up a new device on LineageOS 20 with Seedvault to try and diagnose jackwilsdon/seedvault-extractor#2 - I'm happy to upload a new backup here once I've confirmed it is valid.

@khimaros
Contributor Author

@jackwilsdon that sounds great! i'd be happy to help build a test harness that compares the golden data to the data extracted by your tool. so if you can grab both the full sdcard contents as well as the seedvault backup, that will be very helpful!
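
One possible shape for such a harness, sketched under the assumption that both the golden data and the extractor output are plain directory trees on disk: hash every file in each tree and report what is missing, unexpected, or different. The function names are made up for this example and do not come from any of the tools discussed here.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(golden: Path, extracted: Path) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or differ in content."""
    want = hash_tree(golden)
    got = hash_tree(extracted)
    return {
        "missing": sorted(want.keys() - got.keys()),
        "unexpected": sorted(got.keys() - want.keys()),
        "mismatched": sorted(
            path for path in want.keys() & got.keys() if want[path] != got[path]
        ),
    }
```

An extraction would pass when all three lists come back empty. Reading whole files into memory keeps the sketch short; large media files would be better hashed in blocks.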

@jackwilsdon

Backup (including storage): SeedVaultAndroidBackup.zip
Recovery code: recipe bean exercise lift brother design front mystery convince physical country dust
The only item in storage is a photo at Pictures/jackwilsdon.png. I'm unable to test extracting this as https://github.com/jackwilsdon/seedvault-extractor does not yet support extracting storage backups.

@khimaros
Contributor Author

@jackwilsdon this is a great start, but hard to write tests against this unless we also have the golden sdcard data that the backup was generated from (everything in the emulated storage device). not urgent, but would be helpful for development i reckon.

@jackwilsdon

Would an adb pull /storage/emulated/0 be good enough? I can reset the phone later and generate a new backup and pull the complete storage using that command if that's more useful? I guess I'll have to put some files in storage first, as otherwise I don't think Seedvault will actually back anything up.

nettnikl added a commit to nettnikl/seedvault_backup_parser that referenced this issue May 18, 2023
Include link to related project, to clear up misexpectations as in tlambertz#14
@chirayudesai

Would an adb pull /storage/emulated/0 be good enough?

I actually did adb root and then adb pull /data

I can reset the phone later and generate a new backup and pull the complete storage using that command if that's more useful? I guess I'll have to put some files in storage first, as otherwise I don't think Seedvault will actually back anything up.

Yes, I pushed a couple random images, took some screenshots (including of the recovery code :D), etc.

@jackwilsdon

jackwilsdon commented May 18, 2023

After a fresh reset, I took a few screenshots as you suggested and backed up apps and storage.

Here is /data (without dalvik-cache as it made the zip too large for GitHub): data.zip
And the recovery code: dove little pact broom inform cousin club stock remember debate hobby describe

The backup can be found at media/0/.SeedVaultAndroidBackup.

I've tested restoring this backup after another reset and it appears to restore settings and files just fine 👍

@crass

crass commented May 12, 2024

@jackwilsdon this is a great start, but hard to write tests against this unless we also have the golden sdcard data that the backup was generated from (everything in the emulated storage device). not urgent, but would be helpful for development i reckon.

I don't think we really need the unencrypted data (if the decryption is verifying authentication tags, as it should). When using tink's streaming_aead, an exception will be thrown if the decrypted data does not match the original data. Of course, this assumes a correct implementation of this in tink, but I think that's a reasonable assumption. The current implementation for V0 backups should also be authenticating, because it does decrypt_and_verify with the authentication tag and so should throw an exception if the decrypted data differs from the original data. The perhaps tricky part without the original unencrypted data is verifying that everything is being decrypted. This can be checked by subtracting the set of files used for decryption from the set of encrypted backup files; the result should be empty.
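
The set-subtraction check could be sketched roughly as follows. It assumes the decryption tool records the relative path of every backup file it reads; `unprocessed_files` and its parameters are hypothetical names for this example and are not part of PR #16.

```python
from pathlib import Path


def unprocessed_files(backup_root: Path, decrypted: set[str]) -> set[str]:
    """Return backup files that were never touched during decryption.

    `decrypted` holds paths relative to backup_root, using "/" separators,
    for every file the decryption step actually consumed.
    """
    on_disk = {
        path.relative_to(backup_root).as_posix()
        for path in backup_root.rglob("*")
        if path.is_file()
    }
    return on_disk - decrypted
```

An empty result means every file in the backup directory was consumed. Anything left over is either a file the tool does not understand yet or one it skipped silently.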

@crass

crass commented May 12, 2024

Backup (including storage): SeedVaultAndroidBackup.zip
Recovery code: recipe bean exercise lift brother design front mystery convince physical country dust
The only item in storage is a photo at Pictures/jackwilsdon.png. I'm unable to test extracting this as https://github.com/jackwilsdon/seedvault-extractor does not yet support extracting storage backups.

PR #16 is able to fully decrypt this backup. There are actually 3 files in the storage: Pictures/jackwilsdon.png (199385 bytes), Pictures/.thumbnails/.nomedia (0 bytes), and Pictures/.thumbnails/.database_uuid (36 bytes). This is a decent example of a backup because the files are not encrypted into 1 file each, but all three combined into 1 zip file. A better example backup would also have a larger file that is split into multiple chunks. But I suppose that might be more difficult to host here with github's file size limits.
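
To make the "several small files combined into one zip" case concrete, here is a small sketch that lists the members of such an archive with their sizes once it has been decrypted. The sample archive is built in memory with placeholder contents. Using the file paths from this comment as member names is an assumption made for illustration, and the real archive may name its entries differently.

```python
import io
import zipfile


def list_zip_members(zip_bytes: bytes) -> list[tuple[str, int]]:
    """Return (name, uncompressed size) for every member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def build_sample_zip() -> bytes:
    """Build a stand-in archive; contents are placeholders, not real data."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Placeholder bytes only; the real photo is 199385 bytes.
        archive.writestr("Pictures/jackwilsdon.png", b"\x00" * 1024)
        archive.writestr("Pictures/.thumbnails/.nomedia", b"")
        archive.writestr("Pictures/.thumbnails/.database_uuid", b"0" * 36)
    return buffer.getvalue()
```

Comparing this listing against the expected file sizes is a cheap sanity check before writing the members out to disk.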

@ladar, @Adri-Fa, @nettnikl: Check out this PR and see if it works for you.

@nettnikl

Hey, I have not gone through it fully in detail, as it's quite the bump in functionality, but what I've seen looks good. Sorry for being a bit paranoid, but can we maybe use known (used in public, hashsum known) test images? The recent xz security issue has shown again how risky any blobs, even just in unit tests, are. Maybe just the Lenna image from Wikipedia or something similar? What do you say @khimaros ?

@crass

crass commented May 12, 2024

Hey, I have not gone through it fully in detail, as it's quite the bump in functionality, but what I've seen looks good. Sorry for being a bit paranoid, but can we maybe use known (used in public, hashsum known) test images? The recent xz security issue has shown again how risky any blobs, even just in unit tests, are. Maybe just the Lenna image from Wikipedia or something similar? What do you say @khimaros ?

Perhaps I misunderstand you or I'm misunderstood. I understand you to be worried about the test backup posted by @jackwilsdon. To be clear, I don't really care about people testing my code against that. I've already done it and am confident that it's fine. On the other hand, because any test backup will be encrypted, how do you know for sure what's inside before decrypting it? Even if someone else says it is good?

I'm more interested in people testing against their own backups to catch any missed corner cases (e.g. it looks like @khimaros may have found one, as mentioned in the PR). As far as my code is concerned, there are no binary blobs introduced and I never execute any code from the backup, so the xz-style attack doesn't really apply. All my code is Python source, so for there to be an exploit, an attacker would have to take advantage of bugs in the Python interpreter or dependencies, which would likely require 0-days. Please correct me if you meant something else.

@nettnikl

Hey @crass, sorry, I was a bit unclear. I meant the proposal to have proper unit testing. That would need blobs included in the repo, exactly like in the xz incident. I'm not talking about the manual testing, though; I'm completely with you on that.
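
One way to act on the "hashsum known" idea is to pin a SHA-256 digest for every binary fixture and refuse to run the tests if a file on disk differs. The sketch below shows only that check; `fixture_matches` is a hypothetical helper, and where the pinned digests would live is left open.

```python
import hashlib
import hmac
from pathlib import Path


def fixture_matches(path: Path, expected_sha256: str) -> bool:
    """Check a test fixture against a pinned SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in blocks so large fixtures do not have to fit in memory.
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return hmac.compare_digest(digest.hexdigest(), expected_sha256.lower())
```

A pinned digest only proves the blob has not changed since it was reviewed. It says nothing about whether the blob was safe to begin with, which is why a widely published image is attractive as a fixture.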

7 participants