curl-library

Re: DNS-based cluster awareness for connection pools and pipelines

From: David Strauss <david_at_davidstrauss.net>
Date: Fri, 12 Apr 2013 10:32:38 -0700

On Fri, Apr 12, 2013 at 4:22 AM, Daniel Stenberg <daniel_at_haxx.se> wrote:
> Your talk of "load balancing" makes me suspect that you may have other ideas
> than that, or what you would load balance between exactly?

Fail-over is the first goal, and having a well-balanced load is a
secondary goal. I should provide a bit of history around our DAV
clients connecting to a cluster of multi-master servers called
Valhalla.

Until a few weeks ago, we relied on standard hardware load
balancers to perform health checks, avoid routing to problem nodes,
and balance traffic. Apart from some extra latency, this worked fine
until we started running into over-saturated balancers dropping
packets and connections. Because we've been using cloud-provisioned
balancers, the saturation isn't necessarily from our own traffic. So,
distributing evenly across the balancers wouldn't solve things without
smarter client failover, possibly between balancers.

Now, we're in a transition period to running haproxy on each host. To
avoid storms of health checks, it does mostly passive ones (noticing
when real requests fail) and supports round-robin fail-over. Given the
typical fail-over time of 2+ seconds and the lack of much failure
learning, this would work poorly if connections had to be opened all
the time, but they don't. We use persistent HTTPS. Our active
connections stay open for up to 12 hours of idle time. This scales
well with our event-oriented servers; they don't spend any time on
idle, persistent connections.

But this haproxy model has limitations. For each back-end cluster,
there has to be an haproxy instance. This makes sharding out container
connections more complex than configuring a single client to connect
to the right domain. Distributing updated configuration to haproxy
(and any other balancer) is also hard because we need to kick off
reconfiguration rather than updating something like DNS. We also have
to either add /etc/hosts entries or disable host validation for
HTTPS.
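
To illustrate the host-validation constraint: libcurl has a
CURLOPT_RESOLVE option (curl --resolve on the command line) that
pre-populates its DNS cache with HOST:PORT:ADDRESS entries, so the
certificate is still checked against the real host name. Whether that
fits the setup described here is an assumption; the host name, port,
address, and the resolve_entry helper below are placeholders for
illustration only.

```python
def resolve_entry(host, port, address):
    """Format one entry in the HOST:PORT:ADDRESS form CURLOPT_RESOLVE expects."""
    return f"{host}:{port}:{address}"


# Placeholder values: pin a public name to a local balancer listener.
entry = resolve_entry("dav.example.com", 443, "127.0.0.1")
print(entry)

# Roughly equivalent command line (same placeholder values):
#   curl --resolve dav.example.com:443:127.0.0.1 https://dav.example.com/
```

This only pins one name to one address per entry; it does not by
itself provide fail-over between addresses.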

Meanwhile, cURL has built-in DNS lookup, connection pool management,
and connection re-establishment when reusing a persistent connection
fails. Our ideal would be extending the DNS record awareness into the
retry and pool logic to go from (1) today's ability to reconnect to a
single IP to (2) the ability to reconnect using other IPs listed in a
DNS lookup, possibly using weights.
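
A rough sketch of behavior (2), written against the Python standard
library rather than libcurl, might look like the following. It is not
how cURL works today and not a proposed patch. The function names, the
weights mapping (plain A/AAAA records carry no weights, so these would
have to come from elsewhere), and the 2-second timeout are all
assumptions made for illustration.

```python
import random
import socket


def resolve_all(host, port):
    """Return every distinct IP address the resolver lists for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


def weighted_order(ips, weights=None, rng=random):
    """Order candidates by weighted random draws without replacement.

    Addresses missing from the (hypothetical) weights mapping get weight 1.
    Weights must be positive.
    """
    remaining = list(ips)
    weights = dict(weights or {})
    ordered = []
    while remaining:
        w = [weights.get(ip, 1) for ip in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        remaining.remove(pick)
        ordered.append(pick)
    return ordered


def connect_with_failover(host, port, connect=socket.create_connection,
                          resolve=resolve_all, weights=None, timeout=2.0):
    """Try each listed address in turn; return (ip, connection) on success."""
    last_error = None
    for ip in weighted_order(resolve(host, port), weights):
        try:
            return ip, connect((ip, port), timeout)
        except OSError as exc:
            last_error = exc  # remember the failure, move to the next address
    raise ConnectionError(f"no address for {host} accepted a connection") from last_error
```

The resolve and connect parameters are injectable so the fail-over
order can be exercised without a network. A pool would additionally
need to remember which address each persistent connection used, which
this sketch leaves out.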

If this were implemented, we would also use it for our PHP and Python
API clients, which also connect through load balancers but don't run
into as many saturation issues.

--
David Strauss
   | david_at_davidstrauss.net
   | +1 512 577 5827 [mobile]
Received on 2013-04-12