Synthetic migration torrents #130

Open
SoniEx2 opened this issue Jul 12, 2022 · 18 comments

Comments

@SoniEx2

SoniEx2 commented Jul 12, 2022

Bittorrent v2 doesn't contain provisions for synthesizing v2 torrents from existing v1 torrents. This actually makes upgrading incredibly inconvenient.

It would be nice if clients could seamlessly seed "synthetic" torrents, deterministically derived from v1 torrents and their data, such that the synthetic torrents can be re-uploaded to replace the v1 torrents without losing all seeders (and without having to give existing seeders the new version torrents).

@the8472
Contributor

the8472 commented Nov 19, 2022

This isn't generally possible, since v2 requires piece-aligned files while v1 does not by default (the padding required to align files is an optional extension). Additionally, calculating the hashes requires access to the files; they can't be calculated from a torrent alone, so it wouldn't help people who are only starting to download a torrent.

Since only seeds can participate in such a torrent it wouldn't be particularly useful.
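
To illustrate the alignment point, here is a minimal Python sketch. It assumes the plain v1 layout where files are concatenated back to back with no padding; the function name and the example sizes are made up for illustration.

```python
from typing import List


def misaligned_files(file_lengths: List[int], piece_length: int) -> List[int]:
    """Return indices of files that do not start on a piece boundary.

    Assumes files are laid out back to back, as in a v1 torrent
    created without the optional padding extension.
    """
    misaligned = []
    offset = 0
    for index, length in enumerate(file_lengths):
        if offset % piece_length != 0:
            misaligned.append(index)
        offset += length
    return misaligned


# Hypothetical three-file torrent with 256 KiB pieces.
example = misaligned_files([100_000, 50_000, 262_144], piece_length=262_144)
print(example)
```

Any file reported here shares a v1 piece with its neighbour, so its v1 piece hashes do not line up with the per-file pieces that v2 expects.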

@SoniEx2
Author

SoniEx2 commented Nov 19, 2022

it would be more useful than you think, especially for near-future applications. (maybe you don't want v1 torrents in your app, but you know ppl are gonna try to use them anyway, so you make a compromise and automatically convert to v2 whatever they download as v1. and if someone else tries to do it to the same torrent, they merge together since they're both created with the same algorithm.)

this also means we don't need to care about (un)padding as long as it's consistent.

@the8472
Contributor

the8472 commented Nov 19, 2022

You didn't really explain under which scenario it would provide an improvement.

@SoniEx2
Author

SoniEx2 commented Nov 19, 2022

  1. make social network (standalone app)
  2. allow torrent upload
  3. download locally and convert to v2
  4. if someone else does it with the same v1 torrent they join the same v2 swarm/infohash/etc and share the same v2(-only?) torrent to their followers
  5. in-network torrents are only downloadable if they're v2, only out-of-network torrents can be v1

@the8472
Contributor

the8472 commented Nov 19, 2022

The benefit of that isn't clear. Those aren't hybrid or in any way backwards-compatible torrents, so you're just downloading content and creating a separate swarm (there won't be any sharing between v1 and v2 for most people, unlike hybrid torrents). The only thing that changes is that two people could create the v1->v2 separate swarm independently from an old torrent. But that only works if you have a v1 torrent for those two people to start with.

I don't see the significant benefit over
a) having everyone use the v1 torrent and not creating a split swarm
b) having only one person download the v1 torrent and making a v2 torrent to re-distribute

@SoniEx2
Author

SoniEx2 commented Nov 19, 2022

the benefit is to deprecate v1 torrents entirely, because every time you re-share a v1 torrent, it becomes v2. anyone sharing that, shares a v2 torrent. it creates a one-way pathway for existing v1 torrents to become v2 torrents in the near future, even if not all of them go through the process.

it's about preservation.

@the8472
Contributor

the8472 commented Nov 19, 2022

Yeah, no... it still doesn't make sense. On the one hand it relies on individual action (users taking a v1 torrent and turning it into a v2 torrent) and on the other hand it requires some powerful entity to steer people towards using v2 torrents. Even assuming that both of those things exist it's only a marginal improvement compared to one person taking a v1 torrent and reuploading the content as v2.

Currently there's no plan to deprecate v1 entirely; for now, increasing v2 support is more important. It may happen some day, but by that point old torrents might have died naturally (no more seeders) or people might already have migrated them.

@SoniEx2
Author

SoniEx2 commented Nov 19, 2022

no, it relies on automated action (client converts v1 torrents into v2 torrents, copying an added torrent always copies a v2 magnet) and enables multiple clients to do it the same way (so anyone who's fully downloaded a v1 torrent automatically seeds it on the v2 side, so nobody has to carry the v2 side alone), as clients get updated they automatically start seeding the synthetic v2 versions of these torrents and everyone wins.

@DejayRezme

Similar suggestion here.

I think that, to be useful, you would need a protocol enhancement that lets clients optionally share or receive automatically generated hybrid torrents when they look for a v1 torrent.

So once someone rechecks a V1 torrent someone else looking for that V1 torrent can also receive a V2 hybrid torrent.

This would lead to better swarm merging and more resilient torrents over time.

@SoniEx2
Author

SoniEx2 commented Oct 1, 2024

not particularly, just wait for users to share the torrent by clicking "copy link" on their torrent client, and turn that into V2/hybrid.

alternatively, the DHT scanners will also pick them up. maybe convince DHT scanners to prioritize V2/hybrid versions.

@DejayRezme

DejayRezme commented Oct 1, 2024

Hmm, you mean something like bitmagnet? Yeah that could work. Or BitSearch could do this too I think.

The advantage of a protocol enhancement would be that if a widespread torrent client like qBittorrent implemented this, people wouldn't have to change their behavior or do anything. But eventually everyone gets the benefits of more resilient torrents.

Ideally you would create the merkle tree already while downloading, but I'm not sure if that works with the alignment of the pieces.

@arvidn
Contributor

arvidn commented Oct 20, 2024

@SoniEx2
What's possible today is to download the v1 torrent (all its content) and then create a new v2 (or hybrid) torrent from those files. All v2 torrents should be identical and have the same info-hash (as long as nobody includes any extension fields in the info-dictionary).

As the8472 has explained, in order to create a v2 torrent you need all the content, so you can hash it. There's no way around this. What exactly are you proposing be standardized or specified to make this process better or more reliable?
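
As a rough illustration of why the full content is needed, here is a simplified Python sketch of a per-file merkle root in the style of v2: SHA-256 over 16 KiB blocks, combined pairwise up to a single root. The zero-hash padding to a power of two is my reading of BEP 52 and should be checked against the spec; the function names are made up, and this has not been compared against the output of a real client.

```python
import hashlib
from typing import List

BLOCK_SIZE = 16 * 1024   # v2 hashes file content in 16 KiB blocks
ZERO_HASH = bytes(32)    # assumed padding value for missing leaves


def leaf_hashes(data: bytes) -> List[bytes]:
    """SHA-256 of each 16 KiB block; the last block may be shorter."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(data), BLOCK_SIZE)]


def merkle_root(leaves: List[bytes]) -> bytes:
    """Combine leaf hashes pairwise until a single root remains."""
    if not leaves:
        raise ValueError("an empty file has no leaves to hash")
    # Pad the leaf layer to the next power of two.
    size = 1
    while size < len(leaves):
        size *= 2
    layer = list(leaves) + [ZERO_HASH] * (size - len(leaves))
    while len(layer) > 1:
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]


def file_root(data: bytes) -> bytes:
    """Root hash for one file's content; every byte of the file is read."""
    return merkle_root(leaf_hashes(data))


print(file_root(b"x" * (3 * BLOCK_SIZE)).hex())
```

Every leaf depends on the file bytes, so nothing in a v1 torrent (which only carries SHA-1 hashes of pieces) can be turned into these values without the data itself.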

@the8472
Contributor

the8472 commented Oct 20, 2024

> All v2 torrents should be identical and have the same info-hash (as long as nobody includes any extension fields in the info-dictionary).

There are degrees of freedom, such as file names and the piece length. The individual pieces root fields should be unique though.

Having a reduced canonical representation without those degrees of freedom might help with peer discovery, but that has less to do with migration.
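
A small Python sketch of those degrees of freedom. It builds a v2-style info dictionary for a single file, bencodes it, and takes SHA-256 as the infohash. The bencoder is a minimal one written for this example, the pieces root is a placeholder rather than a real hash, and the helper names are hypothetical.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # dictionary keys must be sorted
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def v2_infohash(name: str, piece_length: int, file_name: str,
                length: int, pieces_root: bytes) -> str:
    """SHA-256 over a bencoded single-file v2-style info dictionary."""
    info = {
        "name": name,
        "piece length": piece_length,
        "meta version": 2,
        "file tree": {file_name: {"": {"length": length,
                                       "pieces root": pieces_root}}},
    }
    return hashlib.sha256(bencode(info)).hexdigest()


root = bytes(32)  # placeholder, NOT a real pieces root
a = v2_infohash("data", 262_144, "video.mkv", 1_000_000, root)
b = v2_infohash("data", 524_288, "video.mkv", 1_000_000, root)   # other piece length
c = v2_infohash("data", 262_144, "Video.mkv", 1_000_000, root)   # other file name
print(a != b, a != c)
```

The same content (same pieces root) ends up under three different infohashes, which is why independently generated torrents only merge if everyone agrees on names and piece length.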

@SoniEx2
Author

SoniEx2 commented Oct 20, 2024

clients should make the v2/hybrid torrent automatically after they finish downloading the v1 torrent files (if told to download the entire v1 torrent). then when copying from the interface, e.g. right-clicking a torrent and copying a magnet link you can pass on to friends, you end up passing on the v2/hybrid torrent seamlessly.

@DejayRezme

Or a protocol extension that allows a client to receive a corresponding hybrid V2 torrent when searching for a V1 torrent. Basically, if clients implement this, it forces an upgrade to V2 torrents.

@the8472
Contributor

the8472 commented Oct 20, 2024

> then when copying from the interface, like rightclick a torrent and copy magnet link you can pass on to friends, you end up passing on the v2/hybrid torrent seamlessly.

This would have a different infohash, which means you would be sending those other people to a torrent where potentially only you are the uploader, not the other preexisting seeds for the v1 torrent.

> Or a protocol extension that allows a client to receive a corresponding hybrid V2 torrent when searching for a V1 torrent

There would be no source of trust that the newly generated torrent matches the existing one. With a hybrid torrent both the v1 and v2 aspects are covered by the same infohash(es) (interpreted as v1 or v2 hashes). So if you trust the original author of the torrent you can trust that both will yield the same content. While this still has to be verified it's still much easier to check than checking each possible torrent generated by random peers, each of which might be sending different nonsense.

@SoniEx2
Author

SoniEx2 commented Oct 20, 2024

> This would have a different infohash, which means you would be sending those other people to a torrent where potentially only you are the uploader, not the other preexisting seeds for the v1 torrent.

this is a non-issue if every client does it.

(can also do a gradual rollout by generating and seeding the v2 torrents today and only replacing the copy link action later, or even selectively replacing the copy link based on v2 swarm size)

(can also make a new magnet link extension that allows carrying a v2 magnet link inside a v1 magnet link)
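
For reference, a Python sketch of a magnet link that carries both hashes. It uses the two-`xt` form that, as far as I know, BEP 52 describes for hybrid torrents: `urn:btih:` with the hex SHA-1 infohash and `urn:btmh:` with a hex multihash (`1220` prefix for SHA-256). Whether clients would accept this for a v1 torrent paired with a separately derived v2 torrent is exactly what a new extension would have to define; the function name is made up.

```python
def combined_magnet(v1_infohash_hex: str, v2_infohash_hex: str) -> str:
    """Build a magnet URI carrying a v1 and a v2 infohash side by side."""
    v1 = bytes.fromhex(v1_infohash_hex)  # raises ValueError on bad hex
    v2 = bytes.fromhex(v2_infohash_hex)
    if len(v1) != 20:
        raise ValueError("v1 infohash must be 20 bytes (SHA-1)")
    if len(v2) != 32:
        raise ValueError("v2 infohash must be 32 bytes (SHA-256)")
    # 0x12 = SHA-256 multihash code, 0x20 = digest length of 32 bytes.
    return f"magnet:?xt=urn:btih:{v1.hex()}&xt=urn:btmh:1220{v2.hex()}"


# Placeholder hashes, for illustration only.
print(combined_magnet("aa" * 20, "bb" * 32))
```

A v1-only client would simply ignore the `btmh` parameter, so a link like this degrades to the plain v1 magnet.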

@DejayRezme

DejayRezme commented Nov 3, 2024

Another idea: Could you have a v1/v2 hybrid "sparse" torrent that only has the v1 piece data plus only the merkle root hash per file? This would still have most advantages:

  • Still can verify each file individually using the v1 piece hashes, the remainder (max 2x piece size) is covered by the merkle hash
  • Can find and merge individual files in swarm
  • Only insignificantly larger than normal v1 torrent
  • Full piece data can be regenerated after download, compatible with V2
  • Can serve v2 pieces for individual files after recheck
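
The first bullet can be sketched in Python as follows. For one file in a v1 layout, it works out which v1 pieces lie entirely inside the file and how many bytes at the head and tail fall into pieces shared with neighbouring files. The names are hypothetical and the layout is the plain unpadded v1 one.

```python
from typing import NamedTuple


class Coverage(NamedTuple):
    first_piece: int      # index of the first v1 piece fully inside the file
    piece_count: int      # number of v1 pieces fully inside the file
    uncovered_bytes: int  # head + tail bytes not covered by those pieces


def v1_coverage(offset: int, length: int, piece_length: int) -> Coverage:
    """Split a file's byte range into whole v1 pieces and a remainder."""
    end = offset + length
    first = -(-offset // piece_length)   # ceiling division
    last_exclusive = end // piece_length
    if last_exclusive <= first:
        # No v1 piece lies entirely inside this file.
        return Coverage(first, 0, length)
    head = first * piece_length - offset
    tail = end - last_exclusive * piece_length
    return Coverage(first, last_exclusive - first, head + tail)


# Hypothetical file starting 100 bytes into the torrent, 256-byte pieces.
print(v1_coverage(offset=100, length=1000, piece_length=256))
```

The uncovered part is always less than two piece lengths, which matches the "max 2x piece size" estimate; only that remainder would depend on the merkle root alone.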

So an extension would allow you to search the DHT and get a list of per-file SHA-256 merkle root hashes by providing a normal v1 info hash. This allows you to upgrade your v1 torrent into a "sparse v1/v2 hybrid torrent". On recheck or download, a client could simply generate, store and serve those.

Anna's archive kind of does this by naming all their ebook files in torrents as MD5. I wonder if there isn't an extension for that already.

I'm not sure if the size of torrent files is really any significant factor for anybody, but a v1/v2 hybrid torrent is almost 3x the size of a v1 torrent.

PS: OK, for a 1000-file torrent that is 32 kB already. But 1000 files is rather rare.
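
The size estimates can be checked with a bit of Python arithmetic. This is a back-of-the-envelope sketch: it counts only hash bytes (20 per v1 piece, 32 per v2 piece-layer entry, 32 per file root) and ignores bencoding overhead, file names, padding files, and the fact that small files have no piece layer. The function name and the example torrent are made up.

```python
def hash_payload_sizes(total_size: int, piece_length: int, file_count: int) -> dict:
    """Approximate hash bytes carried by each kind of torrent."""
    pieces = -(-total_size // piece_length)   # ceiling division
    v1 = 20 * pieces                          # SHA-1 per piece
    roots = 32 * file_count                   # SHA-256 merkle root per file
    return {
        "v1": v1,
        "sparse_hybrid": v1 + roots,
        "full_hybrid": v1 + roots + 32 * pieces,  # plus v2 piece layers
    }


# Hypothetical 10 GiB torrent, 1 MiB pieces, 1000 files.
sizes = hash_payload_sizes(10 * 1024**3, 1024**2, 1000)
print(sizes)
print(round(sizes["full_hybrid"] / sizes["v1"], 2))
```

Under these assumptions the 1000 file roots add 32,000 bytes, and the full hybrid carries a bit under 3x the hash data of the plain v1 torrent.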
