Proposal - Distributed community LCD endpoint

Given the need for more distributed public infrastructure nodes, we propose a community-based LCD endpoint that fronts a pool of endpoints hosted by the community. The idea is to use a proxied Cloudflare load-balancer to absorb much of the DDoS load and internally proxy requests to a known set of community endpoints.

We have created a Cloudflare load-balancer endpoint at https://lcd.lunatics.dev, which currently proxies to just one of our public LCD nodes. This is just a PoC to demonstrate the concept and isn’t production-ready yet. Access is currently rate-limited.

How will it work?

[Diagram: lunatics-lcd-drawio]

  • The proxied load-balancer will be hosted on the main domain (lcd.lunatics.dev). Cloudflare can handle DDoS to a large extent at this point. Traffic would be HTTPS to avoid MITM
  • Community members (validators, enthusiasts) can host an HTTPS-based public endpoint of their own. Refer to our post on this - medium post
  • Traffic from the load-balancer to nginx also needs to be HTTPS to avoid MITM. Nginx can proxy the HTTP-based LCD node. It is recommended to host this nginx node on a separate machine to isolate targeted DDoS
  • The load-balancer can be configured to route traffic to this pool of endpoints. It can also be configured to route traffic based on geography. Announce your endpoints here for testing - sheet link
  • To rule out LCD nodes whose full nodes are out of sync, load-balancer health-checks can be configured to query /syncing and look for a particular string (“false”). When an endpoint returns a syncing status of true, it is automatically marked as unhealthy and excluded from the pool until the next check
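
The health-check rule in the last point can be sketched roughly as below. This is only an illustration in Python, not the actual Cloudflare monitor configuration. It assumes the /syncing response body looks like {"syncing": false}, and it parses the JSON rather than doing a plain string match, which is slightly stricter than what is described above. The endpoint URLs and the function names is_synced and healthy_pool are made up for the example.

```python
import json
from typing import Dict, List


def is_synced(body: str) -> bool:
    """Return True only if the /syncing response body reports syncing == false."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # an unparseable body is treated as unhealthy
    # Assumed response shape: {"syncing": false}
    return isinstance(payload, dict) and payload.get("syncing") is False


def healthy_pool(responses: Dict[str, str]) -> List[str]:
    """Keep only the endpoints whose latest /syncing body passed the check."""
    return [url for url, body in responses.items() if is_synced(body)]


# Hypothetical endpoints and response bodies, for illustration only.
latest = {
    "https://lcd-a.example.org": '{"syncing": false}',
    "https://lcd-b.example.org": '{"syncing": true}',
    "https://lcd-c.example.org": "<html>502 Bad Gateway</html>",
}
print(healthy_pool(latest))
```

An endpoint that is still catching up, or that returns an error page instead of JSON, simply drops out of the pool until a later check passes.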

Challenges

  • Response time might be a little higher due to the additional HTTPS-based hop
  • What if this load-balancer goes down or is compromised? - We could have multiple community load-balancers hosted on different providers and point them to the same pool of endpoints
  • Right now the Terra SDKs accept only one URL for LCD. It would be good if the SDKs accepted multiple backup LCD URLs and used them when the primary LCD goes down. This community LCD can serve as a backup LCD to begin with
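
The backup-URL idea in the last point could look something like the sketch below. It is not how any Terra SDK works today. The function get_with_fallback, its parameters, and the primary.example.org URL are hypothetical, and the HTTP call is replaced by a stand-in (fake_fetch) so the example runs without network access.

```python
from typing import Callable, Sequence


def get_with_fallback(
    urls: Sequence[str],
    path: str,
    fetch: Callable[[str], str],
) -> str:
    """Try each LCD base URL in order and return the first successful response."""
    errors = []
    for base in urls:
        try:
            return fetch(base.rstrip("/") + path)
        except Exception as exc:  # any failure moves on to the next URL
            errors.append(f"{base}: {exc}")
    raise ConnectionError("all LCD endpoints failed: " + "; ".join(errors))


# Stand-in for a real HTTP call so the sketch runs without network access.
def fake_fetch(url: str) -> str:
    if url.startswith("https://primary.example.org"):
        raise TimeoutError("primary is down")
    return f"ok from {url}"


result = get_with_fallback(
    ["https://primary.example.org", "https://lcd.lunatics.dev"],
    "/syncing",
    fake_fetch,
)
print(result)
```

Here the primary fails and the request is served by the community LCD instead. A real client would probably also want timeouts and some memory of which endpoint last worked.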

If this method works out for LCD, it can be extended to FCD.
Open for discussion now.


Thanks for this proposal, the infrastructure’s decentralization is a major point to focus on to make sure Terra is unstoppable.

However, we see some major downsides to this method:

  • Centralized single point of failure
  • Prone to DNS censorship
  • Single entity managing Cloudflare (prone to targeted attacks, social engineering, censorship and requires trust)
  • Substantial operational load needed to maintain it
  • If a node is synced up but weak/not powerful enough, it will affect everybody using this “distributed” endpoint from time to time

It appears to us that the best way of decentralizing infrastructure is:

  1. Make it easy to run
  2. Have multiple parties running multiple instances
  3. Let the end users individually decide which endpoint to use in webapps/extensions

Of course, it is possible to assist the user with curated lists. Even this part could be decentralized further by making the lists importable from an IPFS hash or URL so that anyone can provide one (e.g. Uniswap token lists), but that’s another topic.
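
To make the curated-list idea a bit more concrete, here is a rough Python sketch. The list format is entirely hypothetical (it is not an existing standard, and not the Uniswap token list schema), and the operator names and the insecure.example.org entry are placeholders. It only shows a client loading a list, keeping the HTTPS entries, and leaving the final choice to the user.

```python
import json
from typing import List

# Hypothetical list format, loosely modelled on the token-list idea.
EXAMPLE_LIST = """
{
  "name": "Example community LCD list",
  "endpoints": [
    {"url": "https://lcd.terraindia.info", "operator": "operator-a"},
    {"url": "http://insecure.example.org", "operator": "operator-b"},
    {"url": "https://lcd.lunatics.dev", "operator": "operator-c"}
  ]
}
"""


def load_endpoints(document: str) -> List[str]:
    """Return the https URLs from an endpoint list, skipping anything else."""
    data = json.loads(document)
    urls = []
    for entry in data.get("endpoints", []):
        url = entry.get("url", "")
        if url.startswith("https://"):
            urls.append(url)
    return urls


def choose(urls: List[str], index: int) -> str:
    """The user, not a central balancer, picks which endpoint to use."""
    return urls[index]


endpoints = load_endpoints(EXAMPLE_LIST)
print(choose(endpoints, 0))
```

The same document could be fetched from a URL or an IPFS gateway. The point is that the list is only a suggestion and the selection stays with the user.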

TL;DR: we agree on the idea of having multiple redundant providers, but not on having a centralized single point of failure in front of it, which is the main topic of this proposal. Instead we should let the users individually choose the endpoint of their choice.


Thanks for the feedback.

Apart from the centralised load-balancer, individual endpoints always remain open for direct connections. For example, our endpoint https://lcd.terraindia.info is independently accessible without the load-balancer (lcd.lunatics.dev).

A load-balancer offers a few advantages:

  • Cloudflare’s specialised hardware and software offer real-time protection from DDoS. Our nginx nodes might not be able to sustain a targeted attack
  • Optimal routing based on geography
  • Load-balancers usually don’t go down. Cloudflare claims that its throughput is 15 times bigger than the largest DDoS attack

These days, it’s pretty common to host REST services behind a load-balancer. Judging by a dig of the nameservers for lcd.terra.dev, it also appears to be behind a Cloudflare load-balancer.

Concerns about:

  • Single entity managing Cloudflare (prone to targeted attacks, social engineering, censorship and requires trust) >> That’s right. This was just a PoC. Ideally, we need multiple community endpoints (load-balancers)
  • If a node is synced up but weak/not powerful enough, it will affect everybody using this “distributed” endpoint from time to time >> This comes down to incentives for hosting such services.