Re: Re: How to get HTTP charset? I wanna do charset conversion (or maybe libcurl already has this feature)

From: kartwall
Date: Thu, 10 Jun 2010 18:30:36 +0800 (CST)

Hi, Daniel:

> On Thu, 10 Jun 2010, kartwall wrote:
> > I wanna convert all http responses to UTF-8 because, you know, not all
> > web pages are written in UTF-8. I skimmed the manual of "curl_easy_setopt",
> Not really. The purpose of that functionality is for platforms that do not
> speak ASCII natively to provide a way to make the protocols we use that are
> ascii-based to still work fine.

    Thanks a lot, but I don't understand what a non-ASCII platform is. Is it a Chinese or Japanese PC that uses Chinese or Japanese as its default language? If so, why does libcurl need to convert strings? Almost all protocols (such as HTTP and FTP) are ASCII-based text protocols. Maybe I am misunderstanding something, so could you give me a code example for these two options, or something else to help me out?

> > But here is a big question: How can I know the charset of html file which I
> > received?
> HTML is contents that libcurl may deliver. How to deal with that data is
> beyond what libcurl knows or cares about. You would need to read up on how
> HTML works to figure this out. Of course, there may be HTTP headers in some or
> many cases that help you out.
    Yeah, I got it. In the HTTP response headers I found "Content-Type: text/html; charset=UTF-8", which is what I want: I can read the charset from there and then use iconv to convert the body to another charset.
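
    A minimal sketch of that approach, under some assumptions: the Content-Type value has already been obtained as a string (for example via curl_easy_getinfo() with CURLINFO_CONTENT_TYPE, or from a header callback), and the body is in a memory buffer. The helper names extract_charset() and to_utf8() are hypothetical, not part of libcurl. The header may carry no charset at all, in which case this only reports "not found" and the caller has to look elsewhere (such as the HTML itself).

```c
#include <assert.h>
#include <ctype.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: copy the charset parameter of a Content-Type value
 * (e.g. "text/html; charset=GBK") into out. Returns 1 if found, else 0. */
int extract_charset(const char *content_type, char *out, size_t outlen)
{
    const char *p = content_type;

    assert(out != NULL);
    if (p == NULL || outlen == 0)
        return 0;

    /* Parameters follow the media type, each introduced by ';'. */
    while ((p = strchr(p, ';')) != NULL) {
        p++;
        while (*p == ' ' || *p == '\t')
            p++;
        if (strncasecmp(p, "charset=", 8) == 0) {
            size_t n = 0;
            p += 8;
            if (*p == '"')          /* the value may be quoted */
                p++;
            while (*p && *p != '"' && *p != ';' &&
                   !isspace((unsigned char)*p) && n + 1 < outlen)
                out[n++] = *p++;
            out[n] = '\0';
            return n > 0;
        }
    }
    return 0;
}

/* Hypothetical helper: convert in[0..inlen) from `charset` to UTF-8 in a
 * single iconv() call. Returns the number of bytes written to out, or
 * (size_t)-1 on any failure (unknown charset, invalid or truncated input,
 * output buffer too small). */
size_t to_utf8(const char *charset, const char *in, size_t inlen,
               char *out, size_t outlen)
{
    iconv_t cd = iconv_open("UTF-8", charset);
    char *src = (char *)in;   /* iconv() takes a non-const pointer */
    char *dst = out;
    size_t inleft = inlen;
    size_t outleft = outlen;
    size_t rc;

    if (cd == (iconv_t)-1)
        return (size_t)-1;
    rc = iconv(cd, &src, &inleft, &dst, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    return outlen - outleft;
}
```

    A caller would pass the charset found by extract_charset() straight to to_utf8(). A real program would probably feed iconv() in chunks from the write callback and handle partial multibyte sequences at chunk boundaries; this sketch assumes the whole body is available at once. On some platforms iconv lives in a separate library and needs -liconv at link time.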

    Thanks again.

Received on 2010-06-10