curl-library

DNS Cache

From: Sterling Hughes <sterling_at_bumblebury.com>
Date: Tue, 8 Jan 2002 05:56:53 +0100

    I recently added a DNS cache to libcurl that caches DNS lookups on
    a per-process basis, with a few exceptions and special cases also
    supported.

    By default, libcurl tries its hardest to support multi-threaded
    programs. That makes a global DNS cache hard to offer by default:
    cURL, as it is now, has no concept of mutexes, so it cannot
    serialize access to a global cache across multiple concurrent
    threads. Therefore, by default, the cURL DNS cache works on a
    per-handle basis.

    For example:

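    #include <curl/curl.h>

    int main(void)
    {
        CURL *c1;
        CURL *c2;

        /* First handle: resolves catalogs.google.com and caches the result */
        c1 = curl_easy_init();
        curl_easy_setopt(c1, CURLOPT_URL, "http://catalogs.google.com/");
        curl_easy_perform(c1);
        curl_easy_cleanup(c1); /* c1's cache is destroyed with the handle */

        /* Second handle: starts with an empty cache, so it resolves again */
        c2 = curl_easy_init();
        curl_easy_setopt(c2, CURLOPT_URL,
                         "http://catalogs.google.com/catalog_list");
        curl_easy_perform(c2);
        curl_easy_cleanup(c2);

        return 0;
    }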

    will cause two DNS lookups. The lookup made through c1 is stored in
    a cache local to the c1 handle, and that cache goes away when c1 is
    cleaned up; c2 starts with its own empty cache and has to resolve
    the name again. The per-handle cache is still useful because a
    cURL handle can be re-used for many requests, and the cache
    survives across the individual requests made with it.

    With the new cURL "multi" interface, multiple handles share the
    same transfer and connection space. Because these handles are
    guaranteed to be linked together, we can safely share a cache
    between them. So, rewriting the above example to use the multi
    interface:

    #include <sys/select.h>
    #include <curl/curl.h>

    int main(void)
    {
        CURL *c1;
        CURL *c2;
        CURLM *m;
        struct timeval to;
        int running;
        int rc;
        int max;
        fd_set read;
        fd_set write;
        fd_set except;

        c1 = curl_easy_init();
        c2 = curl_easy_init();

        curl_easy_setopt(c1, CURLOPT_URL, "http://catalogs.google.com/");
        curl_easy_setopt(c2, CURLOPT_URL,
                         "http://catalogs.google.com/catalog_list");

        m = curl_multi_init();

        curl_multi_add_handle(m, c1);
        curl_multi_add_handle(m, c2);

        while (CURLM_CALL_MULTI_PERFORM ==
              curl_multi_perform(m, &running));

        while (running) {
            FD_ZERO(&read);
            FD_ZERO(&write);
            FD_ZERO(&except);

            to.tv_sec = 1;
            to.tv_usec = 0;

            curl_multi_fdset(m, &read, &write, &except, &max);
            /* select() takes five arguments; pass the 1 second timeout */
            rc = select(max+1, &read, &write, &except, &to);
            switch (rc) {
            case -1: /* Error: stop instead of looping forever */
                running = 0;
                break;
            case 0:
            default:
                curl_multi_perform(m, &running);
                break;
            }
        }

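        /* Detach the easy handles before tearing everything down */
        curl_multi_remove_handle(m, c1);
        curl_multi_remove_handle(m, c2);

        curl_easy_cleanup(c1);
        curl_easy_cleanup(c2);

        curl_multi_cleanup(m);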

        return 0;
    }

    would cause only one DNS lookup, since the cache is shared between
    all easy handles attached to the CURLM handle (m).

    Finally, in many cases you don't care about thread safety because,
    well, your application doesn't use threads, and so you might want
    to exploit the advantages of a global DNS cache. One such use is
    PHP with the Apache web server. Apache 1.3 & co. use a pool of
    pre-forked processes to serve requests. This allows for two
    separate Apache states: actions to perform on process
    startup/shutdown, and actions to perform on request
    startup/shutdown. Curl handles cannot survive request startup and
    shutdown. A *global* DNS cache, however, can be maintained on a
    per-process basis, caching DNS lookups over quite a few requests
    and giving a very nice performance gain in many cases. To enable
    global DNS caching, set the CURLOPT_DNS_USE_GLOBAL_CACHE option to
    a non-zero value (i.e., 1).

    Those are the three different ways that cURL caches DNS lookups:
    per handle by default with the easy interface, per handle pool with
    the multi interface, and globally when global DNS caching is
    enabled for non-threaded applications.
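
    The difference between these scopes can be sketched with a toy
    model. This is NOT how libcurl is implemented; the toy_* names and
    the one-entry cache are assumptions made purely for illustration.
    Two "handles" resolve the same host, either through separate
    caches (the easy-interface default) or through one shared cache
    (as with the multi interface or the global cache):

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: a one-entry "DNS cache". */
struct toy_cache {
    char host[64];            /* cached host name, "" when empty */
};

/* Toy model only: a handle that points at the cache it uses. */
struct toy_handle {
    struct toy_cache *cache;  /* own cache, or one shared with others */
};

static int toy_lookups;       /* number of simulated resolver calls */

/* Resolve `host` through the handle's cache, counting real lookups. */
static void toy_resolve(struct toy_handle *h, const char *host)
{
    if (strcmp(h->cache->host, host) == 0)
        return;               /* cache hit: no lookup needed */

    toy_lookups++;            /* cache miss: "ask the resolver" */
    strncpy(h->cache->host, host, sizeof(h->cache->host) - 1);
    h->cache->host[sizeof(h->cache->host) - 1] = '\0';
}

/* Two handles resolve the same host; `shared` selects the layout.
 * Returns how many lookups were needed. */
static int toy_run(int shared)
{
    struct toy_cache a = { "" };
    struct toy_cache b = { "" };
    struct toy_handle h1 = { &a };
    struct toy_handle h2 = { shared ? &a : &b };

    toy_lookups = 0;
    toy_resolve(&h1, "catalogs.google.com");
    toy_resolve(&h2, "catalogs.google.com");
    return toy_lookups;
}
```

    In this model toy_run(0) needs two lookups and toy_run(1) needs
    one, which mirrors the two examples earlier in this mail.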

    Another problem with DNS caching is that you may have applications
    that run for a very, very long time, during which DNS information
    may change (e.g., catalogs.google.com may start resolving to
    10.0.0.143 instead of 10.0.0.112). libcurl does not try to be too
    smart for you (TTLs from name servers are a tricky business), but
    rather leaves it to the programmer to set the cache expiration
    with the CURLOPT_DNS_CACHE_TIMEOUT option. It expects a timeout in
    seconds and defaults to 60. For example, to have cache entries
    time out after 15 seconds, you would use:

    curl_easy_setopt(handle, CURLOPT_DNS_CACHE_TIMEOUT, 15L);

    There are two special values, 0 and -1: 0 completely disables DNS
    caching, and -1 makes it so that the DNS cache *never* expires.
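
    As a rough sketch of those timeout rules, the check below decides
    whether a cached entry may still be used. It is a toy model under
    stated assumptions, not libcurl's actual code: the function name
    toy_entry_usable is made up, and treating an entry as expired once
    its age reaches the timeout exactly is a guess.

```c
#include <time.h>

/* Toy model: may a cache entry stored at `stored` be used at `now`?
 * `timeout` is in seconds, with the two special values from the text. */
static int toy_entry_usable(long timeout, time_t stored, time_t now)
{
    if (timeout == 0)
        return 0;   /* caching disabled: never use a cached entry */
    if (timeout == -1)
        return 1;   /* entries never expire */

    /* Assumption: the entry is usable while its age is below the timeout */
    return difftime(now, stored) < (double)timeout;
}
```

    With a 15 second timeout, an entry that is 10 seconds old is still
    usable in this model, while one that is 15 seconds old is not.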

    *phew, long breath* :)

    -Sterling
Received on 2002-01-08