Automated peer discovery #388

Closed
azazar opened this issue Mar 16, 2019 · 22 comments
Labels
feature request Request for a new feature or change

Comments

@azazar

azazar commented Mar 16, 2019

I like Yggdrasil and find it quite useful. There is a great idea behind it, and its implementation also seems much better than that of CJDNS. It's too bad that the CJDNS developers decided to give up on user experience and discarded the autoconfiguration feature. For me, public peers are the only way to join the network. Looking for peers manually through forums and chats can and must be automated. There is a rule saying that everything that can be automated should be automated, and eventually will be automated. Manual software configuration, even with a graphical UI, frightens away ordinary users and wastes the precious time of everyone who wasn't frightened away. It's a bigger problem than it seems at first glance.

I've seen discussion about using a DHT for that, but a DHT doesn't look like the best choice for nearest-peer discovery. There is a much simpler peer discovery algorithm. Each peer stores a size-constrained list of the nearest known peers. All peers exchange peer lists, and each one updates its own list after checking the latency of the peers in the lists it receives, keeping only the peers with the lowest latency. To have full autoconfiguration, peer exchange should be enabled by default.
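
A minimal sketch of the list-merging step described above, assuming latency is the only ranking criterion. The `probe` callback, the `max_peers` default, and the function name are hypothetical and are not part of Yggdrasil.

```python
from typing import Callable, Dict, Iterable, Optional


def merge_peer_lists(
    own: Dict[str, float],
    received: Iterable[str],
    probe: Callable[[str], Optional[float]],
    max_peers: int = 8,
) -> Dict[str, float]:
    """Return an updated, size-constrained peer table.

    own      -- peers already known, mapped to their last measured latency (ms)
    received -- peer addresses advertised by a neighbour
    probe    -- hypothetical latency measurement in ms; None means unreachable
    """
    candidates = dict(own)
    for addr in received:
        if addr in candidates:
            continue  # already measured, skip the extra probe
        latency = probe(addr)
        if latency is not None:
            candidates[addr] = latency
    # Keep only the max_peers entries with the lowest latency.
    best = sorted(candidates.items(), key=lambda item: item[1])[:max_peers]
    return dict(best)
```

Note that this sketch does nothing about the isolation and scaling concerns raised later in the thread; it only shows the bookkeeping.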

@azazar azazar changed the title Automatic peer discovery [feature request] Automated peer discovery [feature request] Mar 16, 2019
@cathugger
Contributor

cathugger commented Mar 17, 2019

So basically you want an opennet kind of connectivity in addition to the current darknet style.
I can see the utility in that, as I would no longer need to care about removing dead peers and ensuring stuff doesn't break as the world moves, though I'm not sure it's that simple.
Some issues I can think of:

  • What is the maximum peer count? It must be limited, otherwise it wouldn't perform too well.
  • If each peer prefers the closest peers ping-wise, and the peer count is limited, couldn't this favor only the closest community peers, essentially cutting off connectivity with the rest of the world and creating isolated islands?

So, if this ever gets implemented, initially it should be opt-in IMO, as there is definitely stuff to think about.
This proposal kinda reminds me of how Freenet works: it has an opennet mode, where peerings are automatic, and manual peerings are also allowed.

@azazar
Author

azazar commented Mar 17, 2019

The choice of algorithm depends on how peers should be chosen. I didn't see any guidelines on choosing peers, except the advice to find the nearest ones. What is the ideal peer list?

@iShift

iShift commented Mar 17, 2019

+1

The algorithm could be like this (if the peer list in the config is empty):

  1. Try a local connection.
  2. If no local connection is found, try the DHT.
  3. With the DHT we can try to find the nearest peers (3-5) and the farthest peers (2-5).
  4. Then try to connect, going from nearest to farthest; on success, add the peer to a temporary list. This algorithm can help avoid splitting the network into closest-peer sub-networks.

Also, on mobile, DHT server mode should be off by default to save battery.
On server/PC nodes the DHT can be enabled by default with a limit of 50-100 connections.

But alongside the DHT, I think we should have tracker support. Many local networks have "retrackers" for torrents, and we can use them to find local nodes. This would also help if we had a browser module: in a browser we can't have a native DHT, but trackers work well (for example, WebTorrent).
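
The selection in step 3 could be sketched as follows, assuming candidate peers have already been found somehow and each has a measured latency. The function name and the default counts are hypothetical; the counts are picked from the ranges given in the list.

```python
from typing import List, Tuple


def pick_near_and_far(
    peers: List[Tuple[str, float]],
    n_near: int = 3,
    n_far: int = 2,
) -> List[str]:
    """Pick the n_near lowest-latency peers plus the n_far highest-latency ones.

    Addresses are returned nearest-first, which is also the order in which
    step 4 would attempt connections.
    """
    ordered = sorted(peers, key=lambda p: p[1])
    near = ordered[:n_near]
    # Take far peers only from what is left, so nothing is picked twice.
    rest = ordered[n_near:]
    far = rest[-n_far:] if n_far > 0 else []
    return [addr for addr, _ in near + far]
```

Mixing in a few distant peers is the part meant to counter the isolated-islands concern; whether it does so in practice is not something this sketch can show.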

@azazar
Author

azazar commented Mar 17, 2019

  1. If no local connection found - try DHT

Why use DHT? Is there any need for hash tables?

@iShift

iShift commented Mar 17, 2019

@azazar how can you find peers if the list is empty and no local peers are found? The only way is the DHT.

@azazar
Author

azazar commented Mar 17, 2019

@iShift A DHT is a hash table; its purpose is different. There are plenty of options for discovering peers besides a DHT. A DHT can be used for bootstrapping and any other random rendezvous, but I don't know if it's a good choice for building the peer list of a Yggdrasil node. It can be employed to find random nodes that are relevant to some sort of hash, but not nodes that are relevant geographically or for routing.

@Mikaela

Mikaela commented Mar 17, 2019

I didn't see any guidelines on choosing peers, except the advice to find the nearest ones. What is the ideal peer list?

This is one of the open issues for the homepage, yggdrasil-network/yggdrasil-network.github.io#31 (comment). The advice there is that three to four peers is a good number and that they should be located as close to you as possible. It's recommended that public peers peer with each other.

I think ideally you would have a local meshnet and autopeer.

how can you find peers if the list is empty and no local peers are found? The only way is the DHT.

A DHT is not a magic solution either. I think all implementations need bootstrap nodes, and those are often blocked (like IPFS in China), and most implementations (BitTorrent, IPFS) don't remember old peers after a restart. I cannot find the link right now, but there was an article about the Russian internet shutdown that looked at whether decentralized solutions are ready for it.

I think there are also potential privacy concerns, as mentioned in "peer announcing through nodeinfo" #347 (comment).

@neilalexander
Member

So this is something that has been discussed a number of times in the Matrix channel, and ultimately we tend to arrive at the same point.

One thing to understand about Yggdrasil is that it is designed to be link-agnostic - it should work anywhere, whether that's a point-to-point Ethernet cable, cellular network, a wireless link of some kind, over LANs or the Internet. Our project goal is that Yggdrasil should be able to handle all of these network conditions simultaneously and automatically.

When we talk about building automated peer discovery into Yggdrasil, this means that we are building in a dependency on the Internet, as global automatic peer discovery will not typically work on networks that can't reach the Internet. This is contrary to the project goals and it may create some expectations that may turn out to be true when used over the Internet but false or misleading elsewhere.

That said, we understand that people want to automate the way that peers can be added and removed at runtime, hence why we have the admin socket. Therefore our position usually has been that if someone wants to build a utility to facilitate peer discovery, that they should do so initially as a separate utility that communicates with the admin socket to get, add and remove peers. That way we are not building in additional fragility into Yggdrasil.

There have been some discussions about a tool that uses a kind of DHT to distribute peer information, or the ability to stand up centralised trackers where people can "call in" their location and get back a list of suitable peers. Both of these approaches would require the owner of the node to volunteer their geographical location as the network is at its most optimal when people peer with nodes that are close to them (as the resultant latency is lower), so it should be especially clear when exchanging location info happens.

If it works well enough and has wide enough acceptance in the community then we'll even gladly consider distributing it within our packages, or if we find a way of doing it that is agnostic and not internet-dependent (e.g. a tracker approach) then that's something we might eventually look into integrating, but I don't think a DHT would fit that description.

Information about adding and removing peers via the admin API is available here so please feel free to prototype your own tools.
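
As a rough illustration of the "separate utility" idea, here is a minimal sketch of a helper that talks JSON over an already-connected admin socket. The request name `addPeer` and the field `uri` are assumptions made for illustration; the actual request names, fields, and socket address depend on the Yggdrasil version and should be taken from the admin API documentation.

```python
import json
import socket
from typing import Any, Dict


def add_peer_request(uri: str) -> Dict[str, Any]:
    # ASSUMPTION: the request and field names are illustrative only;
    # check the admin API documentation for the version in use.
    return {"request": "addPeer", "uri": uri}


def admin_call(sock: socket.socket, request: Dict[str, Any]) -> Dict[str, Any]:
    """Send one JSON request over a connected socket and read one JSON reply."""
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("admin socket closed before a full reply arrived")
        buf += chunk
        try:
            return json.loads(buf.decode("utf-8"))
        except ValueError:
            continue  # reply is still incomplete, keep reading
```

A discovery tool built this way would connect to the admin socket once, then call `admin_call` for each peer it decides to add or remove, leaving the Yggdrasil binary itself unchanged.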

@neilalexander neilalexander changed the title Automated peer discovery [feature request] Automated peer discovery Mar 17, 2019
@neilalexander neilalexander added the feature request Request for a new feature or change label Mar 17, 2019
@passenger245

Maybe a good kickoff would be to have the public peers formatted consistently, either by following styling guidelines or by using JSON, YAML, etc., so that the peers can be extracted automatically. Currently the styling of the published peers varies.

@azazar
Author

azazar commented Mar 18, 2019

I like the link-agnostic nature of Yggdrasil, and automatic peer discovery on the internet isn't going to break that. I'm not asking to throw away automatic local peer discovery. Every link type is different and approaches differ. Why not treat internet as a special link type? And why not use everything

Yggdrasil already depends on the internet. We are already using the internet for peering. It's the only option for long-distance connectivity for now, and the only option for any connectivity for the majority of peers. How are we supposed to get rid of that dependency? The only scenario I can think of that comes close to realistic is applying Microsoft's "embrace, extend" strategy to the internet. Are there any others?

@Arceliar
Member

Arceliar commented Mar 19, 2019

I've thought about this a lot, and I haven't come up with a solution that doesn't fail in at least one of the following ways:

  1. Not distributed (centralized or federated), which would make the network fragile by introducing extra failure modes.
  2. Doesn't scale (so the network is crushed under the weight of autopeer metadata, even for nodes that aren't trying to use it).
  3. Can't deal with non-transitive connectivity, so it won't actually succeed in finding peers in a lot of cases (it leads to net-splits as a side effect). Specifically, it wouldn't be possible to find nodes outside the same connected component of same-network links without something like walking the DHT (which fails 2).
  4. Finds terrible peers.

Of those, the current setup we have (a centralized list of public peers) fails 1. That's why it lives outside of the main yggdrasil binary. Writing tools to make working with 1 (or otherwise make finding peers easier) is fine, even encouraged, but that's not going to end up in the main yggdrasil binary.

No solution is allowed to fail 2, because then you have a game theoretic perverse incentive to keep the network small, which defeats the purpose of having a scalable mesh network (and if you don't need to scale, then you should use something else--babel and batman are both very good at what they do!).

Failure mode 3 is the hard one. For example, putting connection info into the DHT fails 3 -- not all nodes in the DHT are reachable from all networks, so we either fail to find peers or we need to walk the DHT until we find someone, which fails 2 and kills the network. In either case, we still need public nodes to bootstrap, and those could easily be overloaded if we're not intelligent enough about how we use them, so it doesn't really avoid problem 1 either. The issue with 3 is how insidious the failure mode is. If we succeed at everything else, and local mesh networks (not dependent on the internet) start to grow, then the fraction of nodes with internet connections will begin to drop. That means, the more real/direct links we add between nodes, the more easily broken the automatic internet linking becomes, so the network could break suddenly if it's used for its intended purpose. Also, the DHT doesn't have (and cannot have, because of how all DHTs work, AFAIK) a way to ask for peers from any particular area, so doing the obvious thing would fail 3 and 4 at the same time.

Failure mode 4 is the only one that's tolerable in the long run, because bad links are better than no links, and if we're doing things right then the bad links wouldn't be used much (only when you happen to try to contact someone very near the destination of that link). If I find something that fails only 4, +- bootstrap issues, then that's acceptable eventually, but we don't want to add anything like that in the near future. I want the routing logic to be finalized before we start adding a lot of dynamic and bad links, otherwise it will make everything harder to debug. Also, this isn't necessarily something a home user would want to enable, because then you're very likely to route traffic between some of the many bad links. Even in the cases where the links are good, there's no guarantee that this isn't worse for the network than not having them.

In any case, our goal (for the foreseeable future) is not to replace the internet; it is to figure out how to replace the internet. Automatic peering doesn't help with that, and would actively make it more difficult in the short term, so it's unlikely to happen any time soon.

@azazar
Author

azazar commented Mar 20, 2019

At the moment we have only a centralized peer list that doesn't scale and gives us a very short list of peers of random quality. It fails in at least two ways at once. Isn't a federated list better than a centralized one? If there is no ideal solution, and the currently used solution is actually the worst one, then why not just use the best of the available options? Why not use federated lists instead of a centralized one? Why not experiment with automatic peer selection algorithms to find better peers, rather than using peers handpicked from the public list?

And the original question didn't have that much to do with any of this. It's mostly about automating things we already have. Automation isn't going to change anything with regard to network stability; it's just a way to make Yggdrasil more user friendly and convenient, and a way to attract more people to the project. Isn't that something the project needs?

@passenger245

passenger245 commented Mar 20, 2019

https://github.com/passenger245/yggdrasil-peer-tools/tree/master/exportPeers
outputs at https://api.yggdrasil.icu/peers.json with geoip coordinates.

I guess such output could be used with a script on a webserver that retrieves the client IP and compares which nodes are closest. Maybe some ICMP ping for latency checks somewhere. I'm thinking of a CLI tool that calls a webserver URL for a handful of peers and uses the yggdrasil api tools to insert/sync the nodes.
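
The "compare which nodes are closest" step could be sketched like this, using great-circle distance between the client's GeoIP coordinates and each peer's. The record keys `uri`, `lat`, and `lon` are hypothetical and are not taken from the actual `peers.json` output; geographic distance is also only a rough stand-in for latency.

```python
import math
from typing import Any, Dict, List

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_peers(
    client_lat: float,
    client_lon: float,
    peers: List[Dict[str, Any]],
    count: int = 5,
) -> List[Dict[str, Any]]:
    """Return the `count` peer records closest to the client.

    Each record is assumed (hypothetically) to carry "uri", "lat" and "lon".
    """
    ranked = sorted(
        peers,
        key=lambda p: haversine_km(client_lat, client_lon, p["lat"], p["lon"]),
    )
    return ranked[:count]
```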

@passenger245

Linux/BSD(?) users can use https://raw.githubusercontent.com/passenger245/yggdrasil-peer-tools/master/importPeers/linux-add.sh

By default it adds five nearby peers. For options, pass the --help flag. For feature requests, bugs, or PRs, please use the repository.

@azazar
Author

azazar commented Aug 24, 2019

It doesn't make connecting to Yggdrasil seamless. And it seems to fail in all the ways mentioned by Arceliar.

@VictorNine

How about just making a really simple tool? This can of course be made more advanced in the future.

My suggestion: instead of new users picking public peers at random, a tool can be created that finds the best ones automatically. It pulls the public peer list and pings them all to figure out which are the closest/best peers for you. It could be part of Yggdrasil or a separate tool.

Thoughts? Worth implementing?

Should be able to implement this pretty quickly.
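
A minimal sketch of such a tool's core, assuming the public peer list has already been fetched as a list of `tcp://host:port` URIs. It times a TCP handshake rather than sending ICMP pings, which is a swapped-in technique chosen because it needs no special privileges. All names here are hypothetical.

```python
import socket
import time
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.parse import urlparse


def tcp_connect_ms(uri: str, timeout: float = 2.0) -> Optional[float]:
    """Time a TCP handshake to the peer's host:port; None if it fails."""
    try:
        parsed = urlparse(uri)
        host, port = parsed.hostname, parsed.port
    except ValueError:
        return None  # malformed host or port
    if host is None or port is None:
        return None
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None


def best_peers(
    uris: Iterable[str],
    count: int = 3,
    probe: Callable[[str], Optional[float]] = tcp_connect_ms,
) -> List[Tuple[str, float]]:
    """Probe every URI and return the `count` fastest reachable ones."""
    results = []
    for uri in uris:
        ms = probe(uri)
        if ms is not None:
            results.append((uri, ms))
    results.sort(key=lambda r: r[1])
    return results[:count]
```

The `probe` parameter is injectable so that the ranking can be exercised without touching the network, and so that a different measurement could be dropped in later.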

@zhoreeq
Contributor

zhoreeq commented Jan 8, 2021

@VictorNine The alternative client Popura has an autopeering feature that works something like that.

@VictorNine

Thanks

@zhoreeq
Contributor

zhoreeq commented Jan 8, 2021

@VictorNine no problem!

If you're up to golang hacking, implementation feedback is welcome :-)

@crocket

crocket commented Oct 5, 2021

When we talk about building automated peer discovery into Yggdrasil, this means that we are building in a dependency on the Internet, as global automatic peer discovery will not typically work on networks that can't reach the Internet. This is contrary to the project goals and it may create some expectations that may turn out to be true when used over the Internet but false or misleading elsewhere.

I think it's reasonable to try automatic discovery of peers on layer 2 protocols such as Ethernet and WiFi.

Linking up computers in the same layer 2 network segment can and should be automated.

Automatic layer 2 discovery will allow people to quickly build local yggdrasil networks.

OpenWrt routers could run yggdrasil and discover each other through automatic link layer discovery. Computers connected to the routers would get a connection to the yggdrasil network automatically, without running yggdrasil software themselves, because the routers know how to route to the yggdrasil network.

Perhaps, somebody can sell yggdrasil routers.

@neilalexander
Member

For reasons already discussed this is not something we will do at this time.

@neilalexander neilalexander closed this as not planned Oct 21, 2023
@stevefan1999-personal

stevefan1999-personal commented Jul 9, 2024

I know this is necroposting, but I want to add a few words on this.

There have been some discussions about a tool that uses a kind of DHT to distribute peer information, or the ability to stand up centralised trackers where people can "call in" their location and get back a list of suitable peers. Both of these approaches would require the owner of the node to volunteer their geographical location as the network is at its most optimal when people peer with nodes that are close to them (as the resultant latency is lower), so it should be especially clear when exchanging location info happens.

A well-known example of this is Tor, which publishes its list of peers in the form of a publicly available DHT directory. However, they hard-coded all of the identities into the Tor program itself, including the public keys and expected contact points, which makes it very easy for IDS and DPI systems to block for censorship purposes. This is apparently why Tor is so hard to access in China, where users have to resort to manually obtaining a subset of the peers through a private online service.

By the way, a public directory is also susceptible to Sybil attacks. As a DoS technique, a bad actor could reverse engineer the protocol, create a malicious peer, register it in the public directory, and rinse and repeat until the malicious peers form a large majority. The malicious peers could then either do nothing, dropping any request from other peers, or even log public IPs. This might not stop Yggdrasil as a whole, but it would surely damage a lot of routing decisions and result in the situations of both point 3 and point 4 that @Arceliar described.

IIRC Tor tried to mitigate this by rewarding healthy nodes that have served the network for long enough, in the form of consensus voting. We could also try a PoW approach for this.

Speaking of Sybil attacks, these papers might be useful: https://www.cl.cam.ac.uk/~rja14/Papers/sybildht.pdf and https://dspace.mit.edu/bitstream/handle/1721.1/61338/Kaashoek_Whanau%20A.pdf
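
To make the PoW idea concrete, here is a minimal hashcash-style sketch: a node must find a nonce whose SHA-256 hash, combined with its identifier, has a given number of leading zero bits before a directory would accept its registration. This is an illustration of the general technique only; it is not how Tor or Yggdrasil work, and the function names, the 8-byte nonce encoding, and the difficulty value are all assumptions.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(node_id: bytes, nonce: int, difficulty: int) -> bool:
    """Cheap check a directory could run on every registration."""
    digest = hashlib.sha256(node_id + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(node_id: bytes, difficulty: int) -> int:
    """Expensive search the registering node has to perform once."""
    nonce = 0
    while not verify(node_id, nonce, difficulty):
        nonce += 1
    return nonce
```

Each extra bit of difficulty doubles the expected work for `solve` while `verify` stays a single hash, which is what makes mass registration costly. It raises the price of a Sybil attack but does not prevent one by a well-resourced attacker.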
