curl-library

Connecting to multiple hosts that have the same hostname

From: Michael Kaufmann <mail_at_michael-kaufmann.ch>
Date: Wed, 27 May 2015 21:10:50 +0200

Hi,

I have a challenging scenario for libcurl. I want to connect via
HTTPS to a cluster of ADFS hosts (Active Directory Federation
Services). These hosts share the same hostname, but have different IP
addresses. For example:
- host.example.org, IP: 10.0.0.1
- host.example.org, IP: 10.0.0.2

When establishing a connection, I want to control which host libcurl
connects to (10.0.0.1 or 10.0.0.2), because the hosts have separate
user sessions.
Furthermore, ADFS needs SNI, so it is necessary to use the hostname in
the URL (https://host.example.org). It is not possible to just use the
IP address (https://10.0.0.1/ or https://10.0.0.2/).

As suggested in many mails on this mailing list, I have tried to solve
this using CURLOPT_RESOLVE. I have found two problems:
- CURLOPT_RESOLVE is not a "local" setting because it pre-populates
the DNS cache. All easy handles that use the same multi handle share a
DNS cache. Setting different IP addresses for the same hostname using
CURLOPT_RESOLVE may therefore lead to race conditions.
- A "wrong" existing connection may get reused. libcurl only looks at
the hostname, and does not remember that the connection has been set
up using a special IP address with CURLOPT_RESOLVE. (I have not tested
this, but I have looked at libcurl's source code.)
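
For reference, this is roughly how the CURLOPT_RESOLVE entries for the
two hosts would be built. It is only a sketch: make_resolve_entry() is
a hypothetical helper name and port 443 is an assumption. The
"HOST:PORT:ADDRESS" entry format is the one CURLOPT_RESOLVE expects.

```c
#include <stdio.h>

/* Hypothetical helper: format a CURLOPT_RESOLVE entry "HOST:PORT:ADDRESS".
 * Returns 0 on success, -1 if the buffer is too small. */
int make_resolve_entry(char *buf, size_t buflen,
                       const char *host, int port, const char *ip)
{
    int n = snprintf(buf, buflen, "%s:%d:%s", host, port, ip);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The resulting string (e.g. "host.example.org:443:10.0.0.1") would be
added to a list with curl_slist_append() and set on the easy handle
with curl_easy_setopt(handle, CURLOPT_RESOLVE, list). Because both
hosts produce an entry for the same "host.example.org:443" key, the
entries collide in the shared DNS cache as described above.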

It is possible to work around these problems by using CURLOPT_RESOLVE
with an artificial hostname in the URL, e.g.
"https://host.example.org-10.0.0.1/" - but then the server drops the
connection, because the artificial hostname is also used for SNI. A
libcurl option to control the SNI hostname (and also the hostname used
for the server certificate checks) would make this workaround viable.
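
A rough sketch of that workaround, to show what ends up in the URL and
in the DNS cache. The struct and function names are made up for
illustration, and port 443 is an assumption. Note that the artificial
hostname appears in both strings, which is why it also ends up in the
SNI field.

```c
#include <stdio.h>

/* Hypothetical container for the two strings the workaround needs. */
struct pinned_target {
    char url[256];            /* e.g. "https://host.example.org-10.0.0.1/" */
    char resolve_entry[256];  /* e.g. "host.example.org-10.0.0.1:443:10.0.0.1" */
};

/* Derive an artificial per-IP hostname ("HOST-IP") and build the URL and
 * the matching CURLOPT_RESOLVE entry from it.
 * Returns 0 on success, -1 if a string does not fit. */
int make_pinned_target(struct pinned_target *t,
                       const char *host, const char *ip, int port)
{
    int n = snprintf(t->url, sizeof t->url, "https://%s-%s/", host, ip);
    if (n < 0 || (size_t)n >= sizeof t->url)
        return -1;

    n = snprintf(t->resolve_entry, sizeof t->resolve_entry,
                 "%s-%s:%d:%s", host, ip, port, ip);
    if (n < 0 || (size_t)n >= sizeof t->resolve_entry)
        return -1;

    return 0;
}
```

Since each IP address gets its own hostname, the DNS cache entries and
the cached connections no longer collide. The remaining problem is the
SNI and certificate hostname mentioned above.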

A "clean" solution needs to address the problems mentioned above:
- CURLOPT_RESOLVE should only affect a single easy handle.
- A new option should control the connection cache's behavior (reuse
based on hostname only, or reuse based on hostname + IP address). This
could be implemented in a more general way with a "label" (string)
that gets attached to a connection, so that libcurl is only allowed to
reuse connections when the labels match.
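
To make the "label" idea more concrete, here is a minimal sketch of the
reuse rule I have in mind. None of this exists in libcurl: the struct
and function names are hypothetical, and the plain strcmp() on the
hostname is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a cached connection: the hostname it was opened
 * for, plus an optional application-defined label (NULL = no label). */
struct cached_conn {
    const char *hostname;
    const char *label;
};

/* Two labels match if both are absent or both are equal strings. */
bool labels_match(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Proposed rule: reuse only when the hostname AND the label match. */
bool may_reuse(const struct cached_conn *conn,
               const char *hostname, const char *label)
{
    return strcmp(conn->hostname, hostname) == 0 &&
           labels_match(conn->label, label);
}
```

With the IP address used as the label, a connection opened to 10.0.0.1
would never be reused for a request meant for 10.0.0.2, while handles
that set no label would keep today's hostname-only behavior.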

What do you think? All ideas and suggestions are welcome.

Regards,
Michael

Received on 2015-05-27