IPFS API with verification in enclave #70

Closed
brenzi opened this issue Dec 22, 2019 · 59 comments
@brenzi
Collaborator

brenzi commented Dec 22, 2019

implement an API to ipfs which enables the STF to:

  • fetch a particular ipfs uri.
    • the content can be fetched by the untrusted worker, but the content multihash shall be verified inside the enclave
  • (optional) write content to ipfs

Gitcoin Bounty

design approach

  • please see: "a first version" comment below. not the most elegant, but pretty straight-forward.
  • Alternatively, you may suggest a better approach, e.g. querying the IPFS node directly from within the enclave, which might be less effort and more elegant (caveat: rust-ipfs-api doesn't support no_std yet!).

Please describe your preferred approach in this thread prior to implementing it!

dev environment

You can use our docker to emulate SGX in SW mode, so you don't need SGX enabled HW for this task

acceptance tests:

  • CI test passes
  • unit tests covering all introduced functions pass. in particular:
    • enclave STF queries IPFS content and fails if content isn't correct, passes if correct (matching IPFS hash)
    • must also work for contents larger than 256kB
  • integration test against real IPFS node passes (with similar tests to above)
@brenzi brenzi added this to the FUTURE milestone Jan 30, 2020
@brenzi
Collaborator Author

brenzi commented Jan 31, 2020

helpful resources:

How an IPFS hash is created out of a DAG of 256kB chunks:
https://medium.com/swapynetwork/generating-ipfs-multihash-offline-2edb2618b93b

IPFS node in rust:
https://github.com/ipfs-rust/rust-ipfs

IPFS http API in rust:
https://github.com/ferristseng/rust-ipfs-api
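
To make the 256kB figure concrete, here is a minimal sketch of fixed-size chunking, assuming the default IPFS chunk size of 256 KiB (262144 bytes). The names `CHUNK_SIZE` and `split_into_chunks` are hypothetical and only illustrate why content larger than 256kB ends up as several blocks rather than one; the sketch does not build the DAG or compute any hash.

```rust
/// Default size of the IPFS fixed-size chunker: 256 KiB.
const CHUNK_SIZE: usize = 256 * 1024;

/// Split content into consecutive chunks of at most `CHUNK_SIZE` bytes.
/// The last chunk may be shorter; empty content yields no chunks.
/// (Hypothetical helper for illustration, not part of any IPFS crate.)
fn split_into_chunks(content: &[u8]) -> Vec<&[u8]> {
    content.chunks(CHUNK_SIZE).collect()
}

/// Number of blocks that content of the given length is split into.
fn chunk_count(len: usize) -> usize {
    (len + CHUNK_SIZE - 1) / CHUNK_SIZE
}
```

Content of 600000 bytes, for example, is split into two full chunks and one shorter remainder, so a verifier cannot treat it as a single block.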

@brenzi
Collaborator Author

brenzi commented Jan 31, 2020

A first version could do the following:

  1. run standalone IPFS node
  2. enclave can tell worker to fetch a file from IPFS synchronously
  3. worker fetches file via http request to IPFS node and stores it to disk (file name = IPFS hash)
  4. when the host-call returns, the enclave reads the file from disk into memory
  5. the enclave recreates the IPFS hash from the file contents and verifies that it matches the expected hash it requested

Fetching IPFS content is already implemented. However, passing the content as a function return value might not scale to larger files. Moreover, the content is not verified against its IPFS hash.
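
The trust boundary in the steps above can be sketched as follows. This is only an illustration under loud assumptions: `UntrustedFetch`, `fetch_verified` and `placeholder_digest` are hypothetical names, and `placeholder_digest` uses the standard library's `DefaultHasher` as a stand-in. It is not cryptographic and is not an IPFS multihash; a real enclave would recompute the actual IPFS hash at that point.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for the real content hash. NOT cryptographic, NOT an IPFS hash.
fn placeholder_digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical untrusted side: the worker that talks to the IPFS node.
trait UntrustedFetch {
    fn fetch(&self, id: u64) -> Vec<u8>;
}

#[derive(Debug, PartialEq)]
enum VerifyError {
    HashMismatch,
}

/// Hypothetical trusted side: runs inside the enclave and never trusts
/// the returned bytes until the recomputed digest matches the request.
fn fetch_verified(worker: &dyn UntrustedFetch, expected: u64) -> Result<Vec<u8>, VerifyError> {
    let content = worker.fetch(expected);
    if placeholder_digest(&content) == expected {
        Ok(content)
    } else {
        Err(VerifyError::HashMismatch)
    }
}

/// Worker that returns the stored content unchanged.
struct HonestWorker(Vec<u8>);
impl UntrustedFetch for HonestWorker {
    fn fetch(&self, _id: u64) -> Vec<u8> {
        self.0.clone()
    }
}

/// Worker that appends a byte, simulating tampered content.
struct TamperingWorker(Vec<u8>);
impl UntrustedFetch for TamperingWorker {
    fn fetch(&self, _id: u64) -> Vec<u8> {
        let mut bytes = self.0.clone();
        bytes.push(0);
        bytes
    }
}
```

The point of the design is that the worker may fetch, store and hand over the bytes in any way it likes; only the comparison inside `fetch_verified` decides whether the STF gets to see them.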

@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 400.0 DAI (400.0 USD @ $1.0/DAI) attached to it.

@gitcoinbot

gitcoinbot commented May 1, 2020

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 266 years, 5 months from now.
Please review their action plans below:

1) developerfred has applied to start work.

I already have previous experience with substrate and ipfs i would like to work on this issue.
2) whalelephant has been approved to start work.

Hi,

I think I would follow the first version approach (since not comfortable / not sure how much work is required for using no_std for the rust-ipfs-api).

High level work plan:

  • environment set up (to get familiar with substraTEE and the docker SGX)
  • understand /+ solve the limitations of ocall_read_ipfs on larger content (>256kB)
  • worker write file to disk and enclave reads it into memory
  • enclave create DAG from the chunks in memory and create the multiHash
  • decode CID to multiHash to verify

Learn more on the Gitcoin Issue Details page.

@brenzi
Collaborator Author

brenzi commented May 4, 2020

@x5engine: could you please share your brief work plan? How would you approach this?

@Web3Foundation

@brenzi @x5engine Any updates regarding working on this issue?

@x5engine

Hey guys, I will send the implementation I did the day after tomorrow. We made a good implementation for this.

@brenzi
Collaborator Author

brenzi commented May 27, 2020

@x5engine please see above: you haven't been assigned yet, we asked you for a sketch on how you will approach this. We'd like to review your design before we approve

@x5engine

x5engine commented May 29, 2020

hey @brenzi sorry for the confusion. I plan to query the IPFS node directly to get the hash and save the file to the system; then, when the file is requested, I'd check its checksum against the checksum in the metadata.

so, in other words, it should compute the checksum prior to uploading it

here is something like it: filecoin-project/venus#1925

@brenzi
Collaborator Author

brenzi commented Jun 3, 2020

@Web3Foundation please assign @x5engine to work on this issue. The work plan is not very specific yet, but a more detailed design should be paid work I think.

@x5engine: You mentioned "uploading". Be aware that our task asks for a read interface, not a write interface. We want to fetch IPFS content from within the enclave STF. In order to avoid frustration, please provide us with a more detailed design first before implementing it. Where in our code would you do what?

@x5engine

x5engine commented Jun 6, 2020

Hey, @brenzi @Web3Foundation thank you for approving me,

So here are two strategies to get this implemented:
Following your suggestion

  1. On IPFS node fetch a file by CID
  2. Save File to the system
  3. get the enclave to recreate the hash from the content using https://medium.com/swapynetwork/generating-ipfs-multihash-offline-2edb2618b93b
    However, I think this approach is very slow, even if it is straightforward: there is no need to save the file in order to generate the hash and verify it, so skipping the save step would be very helpful.

The other suggestion is to verify the file inside the enclave, either by fetching it there via the IPFS HTTP API and then generating the hash from the fetched content, or by making use of https://github.com/brenzi/rust-ipfs-verifier (which you made, @brenzi). Not even downloading the file would be better (but downloading might be needed for your requirement).

But a file read might be very costly when handling large files...

Here is the Implementation of the CID in Rust:

https://github.com/multiformats/rust-cid

Using this, we can add a test that checks fetched data against its CID in the enclave:

    // Round-trip an existing `cid` through its byte encoding and back.
    let data = cid.to_bytes();
    let out = Cid::try_from(data).unwrap();

    assert_eq!(cid, out);

So tell me which approach you want to do?

Note: keep in mind that there are differences between CID versions and in how they are generated.

@brenzi
Collaborator Author

brenzi commented Jun 7, 2020

My preference would be to use the IPFS node http API directly without storing the content to disk. You might find inspiration in my outdated rust-verify-ipfs crate, but it isn't more than a code sample. Also, it assumes content is <256kB, which you can't assume for this task.
Concerning CID versions: as long as we can put content with the recent ipfs client and read that from our STF, that's fine. Just document your choices with an example.

@brenzi
Collaborator Author

brenzi commented Jun 9, 2020

this might be interesting for you
w3f/General-Grants-Program#283 (comment)

@x5engine

> My preference would be to use IPFS node http API directly without storing the content to disk. You might find inspiration in my outdated rust-verify-ipfs crate, but it isn't more than a code sample. Also, it assumes content is <256kB which you can't assume for this task.
> concerning cid versions: as long as we can put content with the recent ipfs client and read that from our stf that's fine. Just document your choices with an example

Awesome, so we should go with this option. I can build on what you have in the rust-verify-ipfs crate and make it work with content of any size. As for the CID version, I think we should be fine, and we could perhaps handle both versions.

> this might be interesting for you
> w3f/General-Grants-Program#283 (comment)

This is so good :) I guess I can apply there too, or reuse the work I am doing here? How is that done?

btw do you have a discord channel? might be faster for communication.

@brenzi
Collaborator Author

brenzi commented Jun 10, 2020

meet us on riot: https://riot.im/app/#/room/#substratee:matrix.org

If you can re-use the code you write here for added functionality on substrate, I would guess that would be very welcome.

@gitcoinbot

@x5engine Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • reminder (3 days)
  • escalation to mods (6 days)


@x5engine

hey @gitcoinbot I am still working on this and will respond tomorrow with all I have, thanks

@gitcoinbot

@x5engine Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • reminder (3 days)
  • escalation to mods (6 days)


@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@x5engine due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • reminder (3 days)
  • escalation to mods (6 days)


@brenzi
Collaborator Author

brenzi commented Jun 23, 2020

@x5engine You have not shown a single line of code so far, even after several reminders. You also haven't asked any questions that would indicate that you're actually working on this. I have to ask @Web3Foundation to take this issue away from you.

@Web3Foundation

@brenzi taking @x5engine off this issue as there has been no response in quite some time, re-opening to other contributors.

@Web3Foundation

@brenzi, @whalelephant also applied to work.

@brenzi
Collaborator Author

brenzi commented Jul 6, 2020

@whalelephant are you still interested? How would you approach the task and on what timeline?

@whalelephant
Contributor

Hi @brenzi

Yes, still interested. Copied from my application on Gitcoin:

I think I would follow the first version approach (since not comfortable / not sure how much work is required for using no_std for the rust-ipfs-api).

High level work plan:

  • environment set up (to get familiar with substraTEE and the docker SGX)
  • understand /+ solve the limitations of ocall_read_ipfs on larger content (>256kB)
  • worker write file to disk and enclave reads it into memory
  • enclave create DAG from the chunks in memory and create the multiHash
  • decode CID to multiHash to verify
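
As a rough illustration of the last step, a CIDv0 string ("Qm...") is the base58btc encoding of a multihash, so decoding it yields the hash function code, the digest length and the digest to compare against. The sketch below uses only the standard library; `base58_decode` and `split_multihash` are hypothetical helper names, and the multihash parsing assumes single-byte code and length fields, which holds for sha2-256 (code 0x12, length 0x20) but not for every multihash.

```rust
/// The base58btc alphabet used by CIDv0 strings.
const ALPHABET: &[u8] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Decode a base58btc string into bytes; returns None on an invalid character.
fn base58_decode(s: &str) -> Option<Vec<u8>> {
    // Big number stored little-endian, one byte per element.
    let mut bytes: Vec<u8> = Vec::new();
    for c in s.bytes() {
        let mut carry = ALPHABET.iter().position(|&a| a == c)? as u32;
        // Multiply the accumulated number by 58 and add the new digit.
        for b in bytes.iter_mut() {
            carry += (*b as u32) * 58;
            *b = (carry & 0xff) as u8;
            carry >>= 8;
        }
        while carry > 0 {
            bytes.push((carry & 0xff) as u8);
            carry >>= 8;
        }
    }
    // Each leading '1' encodes one leading zero byte.
    for _ in s.bytes().take_while(|&c| c == b'1') {
        bytes.push(0);
    }
    bytes.reverse();
    Some(bytes)
}

/// Split a multihash into (hash function code, digest).
/// Assumes single-byte code and length fields (true for sha2-256).
fn split_multihash(mh: &[u8]) -> Option<(u8, &[u8])> {
    let (&code, rest) = mh.split_first()?;
    let (&len, digest) = rest.split_first()?;
    if digest.len() == len as usize {
        Some((code, digest))
    } else {
        None
    }
}
```

Calling `base58_decode` on a CIDv0 string and passing the result to `split_multihash` gives the sha2-256 digest that a recomputed root hash would have to match.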

I have also applied to do Hackusama so I think for this bounty will be 4-6 weeks, but do let me know what you have in mind

@gitcoinbot

@whalelephant Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • reminder (3 days)
  • escalation to mods (6 days)


@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@whalelephant due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • reminder (3 days)
  • escalation to mods (6 days)


@brenzi
Collaborator Author

brenzi commented Jul 27, 2020

@whalelephant Do we have to adjust the problem statement due to your reasoning on unixfs not implemented for no_std? May I ask you to conclude your current challenges and possible ways to move forward here?

@whalelephant
Contributor

The assumption behind the initially outlined implementation, that the chunks of data themselves form the multihash of the CID, was incorrect. My current understanding is that the multihash of a default (unixfs) leaf is the hash of a UnixFS struct containing IPFS metadata, optional links to other leaves, and optional data.

The challenge is that the unixfs lib does not currently support no_std, which it needs in order to work in the enclave: rs-ipfs/rust-ipfs/#247.

If the file is added using the raw arg, then the multihash should be the hash of the data in raw bytes, prefixed with the hashing algo. In this case, we can perhaps follow the links of the root leaf (the CID we get back from the /add), its children, their children, etc. to collect the CIDs (v1) of the raw leaves, and verify them against the data we have written to disk, sliced into chunks.

This is my understanding at the moment. Will be grateful for feedback
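
For the raw-leaf case, the binary layout of a CIDv1 can be sketched as below: version 0x01, multicodec 0x55 (raw), then the multihash, which for sha2-256 is 0x12, 0x20 and the 32-byte digest. The function names are hypothetical, and the digest is taken as an input here because computing sha2-256 itself is outside this sketch; in the enclave it would be recomputed from the chunk bytes.

```rust
const CID_VERSION_1: u8 = 0x01; // CID version
const CODEC_RAW: u8 = 0x55; // multicodec code for raw binary
const MH_SHA2_256: u8 = 0x12; // multihash code for sha2-256
const SHA2_256_LEN: u8 = 0x20; // digest length: 32 bytes

/// Build the binary CIDv1 of a raw leaf from the sha2-256 digest of its bytes.
/// (Hypothetical helper; the digest must be computed elsewhere.)
fn cid_v1_raw_bytes(digest: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + digest.len());
    out.extend_from_slice(&[CID_VERSION_1, CODEC_RAW, MH_SHA2_256, SHA2_256_LEN]);
    out.extend_from_slice(digest);
    out
}

/// Check a claimed raw-leaf CID against a digest recomputed from the chunk.
fn raw_leaf_matches(claimed_cid: &[u8], recomputed_digest: &[u8; 32]) -> bool {
    claimed_cid == cid_v1_raw_bytes(recomputed_digest).as_slice()
}
```

A leaf check along these lines only covers the leaves; the links from the root down to those leaves still have to be verified before the root CID can be trusted.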

@brenzi
Collaborator Author

brenzi commented Jul 28, 2020

As we mainly need to read IPFS content that we publish ourselves, it would be acceptable to use some specific IPFS options to make our lives easier at the cost of generality. Do you see a shortcut for this?

@whalelephant
Contributor

whalelephant commented Jul 29, 2020

I don't think this is a shortcut, but I think we can use an IPFS API such as ipfs refs to traverse the tree to get all links and then only compare the hashes of the raw leaves. In other words, instead of constructing the DAG with data in the enclave, we give the enclave the fetched data and an array of links, and the enclave splits and hashes the fetched data to compare with that array.

@brenzi
Collaborator Author

brenzi commented Jul 29, 2020

but then the enclave would need to trust this list of links, right? This wouldn't be acceptable.
We might want to look into the unixfs crate and see how hard it may be to support no_std....

@whalelephant
Contributor

ah ok, that makes sense, that would be placing trust outside of the enclave. Will have a look🤓

@gitcoinbot

@whalelephant Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • reminder (3 days)
  • escalation to mods (6 days)


@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@whalelephant due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • reminder (3 days)
  • escalation to mods (6 days)


@whalelephant
Contributor

looks like there has been work on it in rust-ipfs :)

@brenzi
Collaborator Author

brenzi commented Sep 4, 2020

@whalelephant : I was able to complete all acceptance tests in docker and you have fulfilled the problem statement entirely.
It was worth waiting for this solution. Thank you! Will merge your PR

@Web3Foundation please reward @whalelephant with the stated bounty.

brenzi pushed a commit that referenced this issue Sep 4, 2020
this implements gitcoin bounty #70 
Co-authored-by: bwty <[email protected]>
@brenzi
Collaborator Author

brenzi commented Sep 4, 2020

@whalelephant if you need endorsement to push your changes on dependencies upstream, I'd be happy to support. The goal should really be to drop your forks and get all your no_std stuff upstream

@whalelephant
Contributor

@brenzi that would be great, currently still waiting for feedback / review from the base 3 PRs, then we need to get the cid crate and the ipfs-unixfs merged after them:
OrKoN/base-x-rs#14
multiformats/rust-multihash#81
multiformats/rust-multibase#25

@brenzi
Collaborator Author

brenzi commented Sep 8, 2020

@whalelephant please check riot/element. I DM'd you

@whalelephant
Contributor

@Web3Foundation please can we process this? thanks!

@Web3Foundation

@whalelephant thanks for the ping. @brenzi is this ready to be paid out?

@brenzi
Collaborator Author

brenzi commented Sep 11, 2020

@Web3Foundation yes it is!

@brenzi brenzi closed this as completed Sep 13, 2020
@Web3Foundation

@whalelephant please submit your work via Gitcoin so that we can make the payout.

@whalelephant
Contributor

@Web3Foundation I have, but gitcoin webui is having issues... I think we will just have to wait. 😞
gitcoinco/web#7451

@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 400.0 DAI (400.0 USD @ $1.0/DAI) has been submitted by:

  1. @whalelephant

@Web3Foundation please take a look at the submitted work:


@whalelephant
Contributor

@Web3Foundation this is ready to be paid now. thanks!

@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


The funding of 400.0 DAI (400.0 USD @ $1.0/DAI) attached to this issue has been approved & issued to @whalelephant.

brenzi added a commit to encointer/encointer-worker that referenced this issue Nov 3, 2020
* automatic-shard-joining-and-per-shard-updates-on-block (integritee-network#160)

* [enclave] update_map contains options. This is needed if a storage value needs to be deleted in the STF
[enclave, stf] perform state updates per shard, auto join new shards

* [worker] only feed 100 blocks at a time into the chain relay. Improved logging while syncing to keep track of sync status

* [WorkerApi] Remove default protocol ws.

* update ipfs version (integritee-network#165)

Co-authored-by: bwty <[email protected]>

* Implement Ipfs read and write with verification in enclave

this implements gitcoin bounty integritee-network#70 
Co-authored-by: bwty <[email protected]>

* Encointer contributions upstreaming (integritee-network#174)

* [enclave] ! fix: init-shard if it does not exist
* back up chain relay db before update in case of file corruption
* clean up chain relay sync logging
* Ws server refactor (#13)
* changed ws_server completely. Requests from client are now handled in the worker main event loop to prevent race conditions with on state/chain_relay access.
* [ws_server] remove unwrap and send instead "invalid_client_request" to client
* updating block number in stf state
* [enclave/chain_relay] store only hashes of the headers instead of the headers themselves
* enclave: patch log and env_logger to mesalock
* worker should panic if it can't write to shard
* add public getters for unpermissioned statistics (#16)
* don't request key provisioning form other worker. assume its there or generate new (dangerous!)
* bump version to 0.6.11 like encointer reference release

Co-authored-by: clangenb <[email protected]>
Co-authored-by: Marcel Frei <[email protected]>
Co-authored-by: Christian Langenbacher <[email protected]>

* fix integritee-network#176 and update some dependiencies

* upgrade to upstream 2.0.0-rc5
fix .dispatch filtering introduced in paritytech/substrate#6318
depend on tag version for sgx-runtime

* Upgrade upstream 2.0.0 (integritee-network#182)

* enclave builds
* worker and client builds
* fix metadata module index
* successfully tested shielding-unshielding example

* fix merge. builds and demo works

Co-authored-by: clangenb <[email protected]>
Co-authored-by: bwty <[email protected]>
Co-authored-by: bwty <[email protected]>
Co-authored-by: bwty <[email protected]>
Co-authored-by: Marcel Frei <[email protected]>
Co-authored-by: Christian Langenbacher <[email protected]>