curl-users

Re: Slow DNS lookups compared to wget

From: Rick Richardson <rickr_at_mn.rr.com>
Date: Mon, 18 Feb 2002 12:42:28 -0600

On Mon, Feb 18, 2002 at 09:06:23AM -0600, Rick Richardson wrote:
> On Mon, Feb 18, 2002 at 09:54:24AM +0100, Daniel Stenberg wrote:
> >
> > You either do as Troy suggests, or you grab a source archive and build
> > yourself. It really isn't hard. Even for someone who never did it before
> > (which doesn't necessarily mean you, I just mean that it is easy).
>
> In the end, it's not me I'm worried about, it's the end users of my software
> having to perform the incantations.
>
> I rebuilt curl version 7.9.3 binaries from the Redhat rawhide sources as Troy
> suggested, but the problem still remains:
>
> $ cd /tmp/curl
> $ LD_PRELOAD=usr/lib/libcurl.so.2 usr/bin/curl --version
> curl 7.9.3 (i386-redhat-linux-gnu) libcurl 7.9.3 (OpenSSL 0.9.6b) (ipv6 enabled)
>
> $ LD_PRELOAD=usr/lib/libcurl.so.2 time \
> usr/bin/curl -s "http://quote.bloomberg.com/markets/earnings/ecal.cgi" > xxx
> 0.01user 0.00system 0:30.89elapsed
>
> I think that you may be on to something with this getaddrinfo() lead.
>
> I'll try writing a short test program to see if getaddrinfo() is
> broken when trying to resolve this particular DNS name. N.B. the
> problem only occurs with *some* DNS names, not all.

Yes, the problem seems to be a broken "getaddrinfo()" for some name
lookups, as invoked by curl and as implemented by RH7.2's libc 2.2.4
and kernel 2.4.9-21. Dunno if any other combinations are affected. I
have attached a short test program. Here are the results:

        $ ./getaddrinfo www.yahoo.com www.bloomberg.com
        www.yahoo.com: rc = 0, time = 1 secs
        www.bloomberg.com: rc = 0, time = 29 secs
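
The attachment itself is not included in this copy. As a rough
stand-in, here is a minimal Python sketch of the same kind of check:
it calls getaddrinfo() once per hostname given on the command line
and prints the return code and elapsed time in the format shown
above. The helper name time_lookup is made up for illustration, and
the choice of AF_UNSPEC/SOCK_STREAM hints is an assumption, not
something taken from the original program.

```python
import socket
import sys
import time


def time_lookup(host):
    """Return (rc, seconds) for a single getaddrinfo() call on host.

    rc is 0 on success, otherwise the EAI_* code raised by the resolver.
    """
    start = time.monotonic()
    try:
        # AF_UNSPEC asks for both IPv4 and IPv6 results (an assumption here).
        socket.getaddrinfo(host, None, socket.AF_UNSPEC, socket.SOCK_STREAM)
        rc = 0
    except socket.gaierror as exc:
        rc = exc.args[0]
    elapsed = time.monotonic() - start
    return rc, elapsed


if __name__ == "__main__":
    # Usage: python getaddrinfo_time.py www.yahoo.com www.bloomberg.com
    for host in sys.argv[1:]:
        rc, secs = time_lookup(host)
        print(f"{host}: rc = {rc}, time = {secs:.0f} secs")
```

Running it against one fast and one slow name, with and without the
extra nameservers in /etc/resolv.conf, should show whether the delay
follows the resolver rather than curl.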

I dunno enough about DNS internals to be able to point the finger of
blame here. Curl's "configure" script automagically enables IPv6 and
claims that there is a working getaddrinfo(), so curl uses that instead
of the tried-and-true gethostbyname(). The strace output of the test
program follows below.

The elapsed time is directly proportional to the number of nameservers
I have listed in /etc/resolv.conf. If I eliminate the secondary and
tertiary nameservers, then it takes "only" 10 seconds to resolve the
name. With two, it takes ~20 seconds; with all three, it takes ~30
seconds. It is not sensitive to whose nameservers I use: if I switch
to using the nameservers of another ISP, I get the same delays.

        $ strace -t -o xxx ./getaddrinfo www.bloomberg.com
        www.bloomberg.com: rc = 0, time = 28 secs

12:18:38 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
12:18:38 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, 28) = 0
12:18:38 send(3, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:18:38 gettimeofday({1014056318, 88757}, NULL) = 0
12:18:38 poll([{fd=3, events=POLLIN}], 1, 5000) = 0
12:18:43 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
12:18:43 connect(4, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.33")}}, 28) = 0
12:18:43 send(4, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:18:43 gettimeofday({1014056323, 96361}, NULL) = 0
12:18:43 poll([{fd=4, events=POLLIN}], 1, 3000) = 0
12:18:46 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 5
12:18:46 connect(5, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.94.163.165")}}, 28) = 0
12:18:46 send(5, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:18:46 gettimeofday({1014056326, 106514}, NULL) = 0
12:18:46 poll([{fd=5, events=POLLIN}], 1, 6000) = 0
12:18:52 send(3, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:18:52 gettimeofday({1014056332, 116333}, NULL) = 0
12:18:52 poll([{fd=3, events=POLLIN}], 1, 5000) = 0
12:18:57 send(4, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:18:57 gettimeofday({1014056337, 126429}, NULL) = 0
12:18:57 poll([{fd=4, events=POLLIN}], 1, 3000) = 0
12:19:00 send(5, "\30\325\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:19:00 gettimeofday({1014056340, 136488}, NULL) = 0
12:19:00 poll([{fd=5, events=POLLIN}], 1, 6000) = 0
12:19:06 close(3) = 0
12:19:06 close(4) = 0
12:19:06 close(5) = 0
12:19:06 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
12:19:06 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, 28) = 0
12:19:06 send(3, "\30\326\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\vl"..., 47, 0) = 47
12:19:06 gettimeofday({1014056346, 147088}, NULL) = 0
12:19:06 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
12:19:06 recvfrom(3, "\30\326\205\203\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\v"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, [16]) = 47
12:19:06 close(3) = 0
12:19:06 gettimeofday({1014056346, 260464}, NULL) = 0
12:19:06 getpid() = 28574
12:19:06 brk(0x804c000) = 0x804c000
12:19:06 open("/etc/resolv.conf", O_RDONLY) = 3
12:19:06 fstat64(3, {st_mode=S_IFREG|0644, st_size=109, ...}) = 0
12:19:06 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000
12:19:06 read(3, "nameserver 24.26.163.32\nnameserv"..., 4096) = 109
12:19:06 read(3, "", 4096) = 0
12:19:06 close(3) = 0
12:19:06 munmap(0x40018000, 4096) = 0
12:19:06 socket(PF_UNIX, SOCK_STREAM, 0) = 3
12:19:06 connect(3, {sin_family=AF_UNIX, path="/var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
12:19:06 close(3) = 0
12:19:06 open("/etc/hosts", O_RDONLY) = 3
12:19:06 fcntl64(0x3, 0x1, 0, 0x1) = 0
12:19:06 fcntl64(0x3, 0x2, 0x1, 0x1) = 0
12:19:06 fstat64(3, {st_mode=S_IFREG|0644, st_size=363, ...}) = 0
12:19:06 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000
12:19:06 read(3, "# Do not remove the following li"..., 4096) = 363
12:19:06 read(3, "", 4096) = 0
12:19:06 close(3) = 0
12:19:06 munmap(0x40018000, 4096) = 0
12:19:06 open("/var/nis/NIS_COLD_START", O_RDONLY) = -1 ENOENT (No such file or directory)
12:19:06 open("/var/nis/NIS_COLD_START", O_RDONLY) = -1 ENOENT (No such file or directory)
12:19:06 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
12:19:06 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, 28) = 0
12:19:06 send(3, "\323u\1\0\0\1\0\0\0\0\0\0\3www\tbloomberg\3com\0\0"..., 35, 0) = 35
12:19:06 gettimeofday({1014056346, 262948}, NULL) = 0
12:19:06 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
12:19:06 recvfrom(3, "\323u\205\200\0\1\0\1\0\0\0\0\3www\tbloomberg\3com\0\0"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, [16]) = 51
12:19:06 close(3) = 0
12:19:06 socket(PF_UNIX, SOCK_STREAM, 0) = 3
12:19:06 connect(3, {sin_family=AF_UNIX, path="/var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
12:19:06 close(3) = 0
12:19:06 open("/etc/hosts", O_RDONLY) = 3
12:19:06 fcntl64(0x3, 0x1, 0, 0x1) = 0
12:19:06 fcntl64(0x3, 0x2, 0x1, 0x1) = 0
12:19:06 fstat64(3, {st_mode=S_IFREG|0644, st_size=363, ...}) = 0
12:19:06 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000
12:19:06 read(3, "# Do not remove the following li"..., 4096) = 363
12:19:06 read(3, "", 4096) = 0
12:19:06 close(3) = 0
12:19:06 munmap(0x40018000, 4096) = 0
12:19:06 open("/var/nis/NIS_COLD_START", O_RDONLY) = -1 ENOENT (No such file or directory)
12:19:06 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
12:19:06 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, 28) = 0
12:19:06 send(3, "\323v\1\0\0\1\0\0\0\0\0\0\0019\003240\003179\003204\7i"..., 44, 0) = 44
12:19:06 gettimeofday({1014056346, 314465}, NULL) = 0
12:19:06 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
12:19:06 recvfrom(3, "\323v\205\200\0\1\0\1\0\1\0\1\0019\003240\003179\00320"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("24.26.163.32")}}, [16]) = 112
12:19:06 close(3) = 0
12:19:06 time([1014056346]) = 1014056346
12:19:06 fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 7), ...}) = 0
12:19:06 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000
12:19:06 write(1, "www.bloomberg.com: rc = 0, time "..., 42) = 42
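
The 28 seconds can be accounted for from the trace itself: each
poll() that returns 0 is a query that timed out, and the timeout
argument is how long the resolver waited. The Python sketch below
adds up those waits. TRACE holds the poll lines copied from the
strace output above (plus the first successful poll for contrast);
the helper name timed_out_ms is only illustrative.

```python
import re

# poll() lines taken from the strace output above.
TRACE = """\
12:18:38 poll([{fd=3, events=POLLIN}], 1, 5000) = 0
12:18:43 poll([{fd=4, events=POLLIN}], 1, 3000) = 0
12:18:46 poll([{fd=5, events=POLLIN}], 1, 6000) = 0
12:18:52 poll([{fd=3, events=POLLIN}], 1, 5000) = 0
12:18:57 poll([{fd=4, events=POLLIN}], 1, 3000) = 0
12:19:00 poll([{fd=5, events=POLLIN}], 1, 6000) = 0
12:19:06 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
"""

# A poll() that returns 0 expired without any reply arriving.
TIMED_OUT = re.compile(r"poll\(.*, (\d+)\) = 0$")


def timed_out_ms(trace):
    """Return the timeout (in ms) of every poll() call that returned 0."""
    waits = []
    for line in trace.splitlines():
        match = TIMED_OUT.search(line.strip())
        if match:
            waits.append(int(match.group(1)))
    return waits


waits = timed_out_ms(TRACE)
print(len(waits), "timed-out polls,", sum(waits) / 1000, "seconds")
# prints: 6 timed-out polls, 28.0 seconds
```

That is two rounds of 5 + 3 + 6 seconds across the three nameservers,
all spent on the first query; the queries sent afterwards are
answered within the same second.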

-- 
Rick Richardson  rickr@mn.rr.com        http://home.mn.rr.com/richardsons/

Received on 2002-02-18