curl-users

libcurl and IDNA

From: Gisle Vanem <gvanem_at_broadpark.no>
Date: Tue, 6 Apr 2004 14:44:41 +0200

With the recent possibility to register domain-names with
non-ASCII characters it would be nice if libcurl would support
that in some way.

What would happen in curl today if one enters an IDN in some
East Asian encoding? I guess it would break in sscanf() etc. (but
maybe UTF-8 works?)
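
To make the question concrete, here is a small Python sketch (not libcurl
code; the helper name high_bytes is made up for illustration) showing that
the same label turns into different non-ASCII bytes depending on the
encoding the user's terminal happens to use. Any parser that works on raw
bytes would see these values in the hostname part of the URL:

```python
def high_bytes(label: str, encoding: str) -> list[int]:
    """Return the bytes >= 0x80 that `label` produces in `encoding`.

    Hypothetical helper for illustration only; plain ASCII hostnames
    never yield such bytes.
    """
    return [b for b in label.encode(encoding) if b >= 0x80]


if __name__ == "__main__":
    for enc in ("utf-8", "latin-1"):
        found = [hex(b) for b in high_bytes("tromsø", enc)]
        print(f"{enc}: {found}")
```

Whether curl's URL parsing copes with such bytes is exactly the open
question; the sketch only shows what the input looks like.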

This should ideally be the task of the OS or TCP/IP stack, but
since the standard is very new, it's not. Besides, that would not work for
protocols that expose hostnames in the application layer.
E.g. the HTTP 1.1 "Host" header must include the domain name in ACE
(ASCII Compatible Encoding) form. So this won't work if the web server
serves multiple domains:
  GET /some/document HTTP/1.1
  Host: www.tromsø.no

But this should work:
  GET /some/document HTTP/1.1
  Host: www.xn--troms-zuA.no

(Try it in curl and you'll see that you get two different outputs.)
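
As a rough illustration of the conversion involved, here is a minimal
Python sketch. It is not how libcurl does or would do it; it just uses
Python's built-in "idna" codec, which implements the RFC 3490 ToASCII
operation label by label. The function names to_ace and build_request are
made up for this example.

```python
def to_ace(hostname: str) -> str:
    """Convert a Unicode hostname to ASCII Compatible Encoding (ACE) form.

    Relies on Python's built-in "idna" codec (IDNA as in RFC 3490).
    Labels that are already plain ASCII pass through unchanged.
    """
    return hostname.encode("idna").decode("ascii")


def build_request(hostname: str, path: str) -> str:
    """Build a bare HTTP/1.1 request whose Host header is in ACE form."""
    return f"GET {path} HTTP/1.1\r\nHost: {to_ace(hostname)}\r\n\r\n"


if __name__ == "__main__":
    print(build_request("www.tromsø.no", "/some/document"))
```

The codec emits the ACE label in lower case (xn--troms-zua); since
hostnames compare case-insensitively, that names the same host as the
form written above.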

There are several IDN libraries around, but GNU libidn looks promising
(http://josefsson.org/libidn/). A drawback is that it requires iconv, which
adds approx. 1 MByte of data/code for all those charset tables. On Windows
it would be better to use WinNLS instead. But that's another matter.

Any comments?

Ref. RFC-3490
Internationalizing Domain Names in Applications (IDNA)

--gv
Received on 2004-04-06