curl-library

curl html encoding

From: Timo Lange <tilange_at_mail.uni-paderborn.de>
Date: Mon, 20 Oct 2014 10:27:53 +0200

Hey there,

I am using libcurl from C++ to download a website into a text file.
Unfortunately, the links contained in the downloaded page are not encoded
the way I expect.
For example, & becomes &amp; and = becomes %3D.
Do you have any idea how to avoid this?
Below you will find the curl part of my code.

Best regards
Timo
___________________________________________________________________

void download_to_file(char *url, char resultfile[FILENAME_MAX])
{
     CURL *curl;
     CURLcode res;
     FILE *fp;
     curl = curl_easy_init();
     if(curl)
     {
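        fp = fopen(resultfile,"wb");
        if(!fp)
        {
           /* Cannot open the output file: release the handle and bail out */
           fprintf(stderr, "could not open %s for writing\n", resultfile);
           curl_easy_cleanup(curl);
           return;
        }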
        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
        curl_easy_setopt(curl, CURLOPT_USERAGENT,
                         "Mozilla/5.0 (Windows NT 5.1; rv:31.0) "
                         "Gecko/20100101 Firefox/31.0");
        res=curl_easy_perform(curl);
        /* Check for errors */
        if(res != CURLE_OK)
           fprintf(stderr, "curl_easy_perform() failed: %s\n",
                   curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        fclose(fp);
    }
    else
    {
        cout << "curl was not initialized correctly\n";
    }
}

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2014-10-20