ZigZag PoRep #77

Closed
13 of 23 tasks
porcuquine opened this issue Jul 3, 2018 · 3 comments

porcuquine commented Jul 3, 2018

ZigZag PoRep

The one true proof of replication (for now and through initial release, we think).

Current strong focus is on a viable ZigZag PoRep which meets both scaling and security requirements. This will require many optimizations, but ultimately boils down to squeezing better performance (less CPU and less RAM per GiB replicated) and supporting very large sectors.

This issue will serve as a roadmap for that work. The high-level outline below should be broken out into sub-issues. Those working on these issues need to see this work as high-priority, with an emphasis on fast iteration.

Our goal now is to implement a fully viable Proof-of-Replication. This work falls into two broad categories:

Correctness

There is one item in the list below ('Implement challenge derivation correctly') which needs to be updated to reflect current needs (TODO: @porcuquine). However, this is an easy change and a low priority for the short-term push. #404 is also a hanging thread but does not block the work below.

Performance

PoRep performance relates to our ability to simultaneously meet security and scaling requirements. Good news: we have now run a secure Proof-of-Replication (i.e. sufficient challenges with full parameters). For the moment, I omit most supporting documentation, but that is being worked on in one work stream below. Together, this work represents one (and for now our best) answer to the question posed in #157.

We now need to overcome two sequential hurdles:

  • Meet scaling requirements with respect to proof size. Based on data from our secure PoRep run, we estimate this means we need to replicate and prove a 64GiB sector. On a machine with enough RAM, that should not be an issue. However, 'should' does not always predict reality. Larger sectors #522 may address the proof-size problem, but we should be ready to route around any subsequent obstacles.
  • Meet scaling requirements with respect to CPU time. This will be harder because, as it turns out, the time required to generate our circuit proofs is significant enough that it creates even greater pressure than proof size does. We will tackle this in several ways:
    • Maximize replicable sector size by minimizing required memory. See issue [TODO: copy bullets into issue @porcuquine]
    • Minimize CPU time of replication. See issue [TODO: copy bullets into new issue @porcuquine]
      • Optimize a minimal-parallelism path with minimal CPU usage (also focused on memory efficiency). [TODO: @porcuquine make issue, coalesced with above]
    • Sealing with Blake2s (was: Hybrid Merkle Trees) #531: Pending calculations verifying a projected solution to the performance equation, implement a hybrid hashing solution for Merkle trees. The idea is to allow a tunable public parameter (a tree depth) specifying the portion of the tree which should use Blake2s instead of Pedersen hashes. This will allow us to trade proof size for speed. This is exactly the lever we need to take advantage of our surplus proof-size budget and apply it to CPU-time performance. (A minimal sketch of the idea follows this list.)
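
A minimal sketch of the hybrid-tree idea, in Rust. The names and the split direction (Blake2s near the leaves, Pedersen near the root) are assumptions for illustration, not the #531 design; the bullet above only fixes that a tunable depth parameter marks which portion of the tree uses which hash.

```rust
// Sketch only, not the #531 implementation.
type Node = [u8; 32];

/// Builds a Merkle root, switching hash functions at a tunable depth.
/// `blake2s_depth` is the (public) number of levels, counted from the leaves,
/// hashed with Blake2s (fast natively, more constraints in-circuit); the
/// remaining levels use Pedersen (slow natively, cheap in-circuit).
fn build_tree<F, G>(leaves: &[Node], blake2s_depth: usize, blake2s: F, pedersen: G) -> Node
where
    F: Fn(&Node, &Node) -> Node,
    G: Fn(&Node, &Node) -> Node,
{
    assert!(!leaves.is_empty() && leaves.len().is_power_of_two());
    let mut level: Vec<Node> = leaves.to_vec();
    let mut depth = 0;
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                // Levels below the threshold use the cheap native hash;
                // the rest use the circuit-friendly hash.
                if depth < blake2s_depth {
                    blake2s(&pair[0], &pair[1])
                } else {
                    pedersen(&pair[0], &pair[1])
                }
            })
            .collect();
        depth += 1;
    }
    level[0]
}
```

Raising `blake2s_depth` cuts native hashing time at the cost of larger circuit proofs; lowering it does the opposite, which is the proof-size/CPU-time trade described above.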

In support of these optimizations, we need to have (and start using) a better way to measure and track change.
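
One hedged illustration of what "measure and track change" could look like, assuming we standardize on criterion benchmarks; `replicate_stub` is a hypothetical stand-in for the real replication entry point:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Hypothetical stand-in for the real replication entry point under test.
fn replicate_stub(data: &[u8]) -> Vec<u8> {
    data.iter().map(|b| b.wrapping_add(1)).collect()
}

fn bench_replication(c: &mut Criterion) {
    // Small fixed input here; real runs would sweep sector sizes and record
    // CPU time and peak memory per GiB replicated.
    let data = vec![0u8; 1024];
    c.bench_function("replicate_1KiB", |b| b.iter(|| replicate_stub(black_box(&data))));
}

criterion_group!(benches, bench_replication);
criterion_main!(benches);
```

Running this per commit and comparing results over time would give us the tracking this paragraph asks for.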

Because the above (non-exhaustive) list includes multiple overlapping and potentially conflicting optimizations, we need a better configuration story, as sketched in #501. Once implemented, optimizations should integrate with it, ensuring the go-filecoin defaults allow the nightly and user devnets to continue functioning, while also making it easy to configure benchmarks (especially the ZigZag 'example') for the properties required both to acquire data along the way and, ultimately, to run a first fully secure, fully scalable proof of replication.
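
As a hedged illustration only (not the #501 design; all names and values here are hypothetical), a single settings struct could carry the tunable knobs, with defaults safe for the devnets and explicit overrides for benchmarks:

```rust
/// Hypothetical configuration sketch; field names and defaults are illustrative only.
#[derive(Debug, Clone)]
pub struct PorepConfig {
    /// Sector size in bytes (the scaling target above is 64 GiB).
    pub sector_size: u64,
    /// Number of ZigZag layers.
    pub layers: usize,
    /// Challenges per layer.
    pub challenge_count: usize,
    /// Levels (from the leaves) hashed with Blake2s instead of Pedersen.
    pub blake2s_depth: usize,
}

impl Default for PorepConfig {
    fn default() -> Self {
        // Placeholder values: defaults must keep the go-filecoin nightly and
        // user devnets working; benchmarks override them explicitly.
        PorepConfig {
            sector_size: 1 << 30,
            layers: 10,
            challenge_count: 20,
            blake2s_depth: 0,
        }
    }
}
```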

The trajectory of this work needs to be plotted and adjusted through ongoing benchmarks and experimental runs. In support of this tactical work, and as the first step in a larger project, we will deliver a presentation of key data in a form that supports our calculations. This should eventually serve as documentation of the relationships among many key Filecoin parameters, as well as relate them to hardware requirements. [TODO: @porcuquine make placeholder cryptolab epic].

#477


Historical work, with a few stragglers:

  • Factor Layers trait out of extant layered_drgporep implementation.
  • Introduce 'expander' component to graphs.
  • Implement zigzag graph toggle.
  • Add ZigZagDrgPorep, implementing Layers.
  • Implement Circuit
  • Make zigzag_test_compound pass: for some reason verification fails, though the inputs seem correct for the generated circuit proof.
  • Multiple challenges
    • Add support for multiple challenges.
    • Add challenge-derivation method. (Stub exists in API; see the sketch after this list.)
    • Implement challenge derivation correctly.
    • Only generate replica-parent Merkle proofs for 50% of challenges.
  • Don't prove data inclusion except on first layer. (Prove identity to previous layers, replicated data.) Not applicable with challenges varying per layer.
  • Add vector commit for each layer's CommR.
  • Integrate generated parameter cache with CircleCI caching.
  • Generate proof while replicating. Can't do this.
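
For the challenge-derivation items above, a hedged sketch of the usual approach (illustrative only; the API stub, not this code, defines the project's actual scheme): derive each challenge deterministically from the replica commitment and a seed, so prover and verifier agree on the challenged leaves.

```rust
/// Illustrative challenge derivation: hash (comm_r || seed || counter) and
/// reduce modulo the number of leaves. `hash` would be a cryptographic hash
/// such as Blake2s in practice.
fn derive_challenges<H>(comm_r: &[u8], seed: &[u8], leaves: usize, count: usize, hash: H) -> Vec<usize>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    (0..count)
        .map(|i| {
            let mut preimage = Vec::with_capacity(comm_r.len() + seed.len() + 8);
            preimage.extend_from_slice(comm_r);
            preimage.extend_from_slice(seed);
            preimage.extend_from_slice(&(i as u64).to_le_bytes());
            let digest = hash(&preimage);
            // Take the first 8 digest bytes as a little-endian integer; a real
            // implementation should also avoid modulo bias.
            let mut word = [0u8; 8];
            word.copy_from_slice(&digest[..8]);
            (u64::from_le_bytes(word) as usize) % leaves
        })
        .collect()
}
```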

@porcuquine porcuquine added this to the Sprint 17 milestone Jul 3, 2018
@porcuquine porcuquine mentioned this issue Jul 6, 2018
@porcuquine porcuquine added the Epic label Jul 6, 2018
@porcuquine porcuquine removed this from the Sprint 17 milestone Jul 11, 2018

nicola commented Dec 3, 2018

Re-using this Epic to track the ZigZag work to be done for stage 3; added some issues to it.

@dignifiedquire

no more zigzag for now


jon-chuang commented Apr 29, 2020

Hi, I would like to ask whether ZigZag PoRep was rolled back because it was unsuitable for PoSt due to the SEAL stacking attack. If so, I am still interested in it as a means of PoRep on network-controlled data without PoSt-based elections, etc. Is there anyone I can contact regarding this?

I have many questions, especially about how long the lockout time can be made, and whether it can be made arbitrarily long by stacking many layers. If one could make timeouts take 1 hour, say by stacking 1 million layers, with an initial sealing time of, say, 6 hours, that would be ideal.

I see, @porcuquine, that in your blog post one can achieve a 24-hour timeout. This would be amazing for us. Can I ask if this still holds? Further, what is the initial sealing time? I expect it would be several days then?
