
curl-library

Re: using easy curl

From: Julien Chaffraix <julien.chaffraix_at_gmail.com>
Date: Wed, 17 Feb 2010 20:12:37 -0800

>> Is there an http parser class for c++ available?

> I assume you mean an HTML parser class.

> I don't know about C++ but for plain C, any one of
> these might do:
[snip some links]
> http://expat.sourceforge.net/

Actually, expat is an XML parser, so it will not work reliably on HTML
pages, which are often not well-formed XML. Also, if you go with libxml2,
be sure to use its HTML parser
(http://xmlsoft.org/html/libxml-HTMLparser.html) rather than the XML one.
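
To illustrate the difference, here is a minimal sketch in Python rather
than C, chosen only because its standard library ships both an expat
binding (xml.parsers.expat) and a lenient HTML parser (html.parser). The
sample markup and the names expat_accepts, LinkCollector and
collect_links are made up for this example. It is not libxml2 code, just
a demonstration of why an XML parser rejects typical HTML.

```python
import xml.parsers.expat
from html.parser import HTMLParser

# Hypothetical page: <br> and <p> are never closed, which is fine in
# HTML but not well-formed XML.
SAMPLE_HTML = (
    '<html><body><p>Hello<br>'
    '<a href="http://curl.haxx.se/">curl</a></body></html>'
)


def expat_accepts(markup):
    """Return True if expat parses the markup as well-formed XML."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(markup, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, tolerating sloppy HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(markup):
    """Feed the markup to the HTML parser and return the links found."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print("expat accepts the page:", expat_accepts(SAMPLE_HTML))
    print("links found by the HTML parser:", collect_links(SAMPLE_HTML))
```

The same split applies in C: expat or libxml2's XML reader expects
well-formed input, while libxml2's HTML parser is built to recover from
unclosed tags.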

Depending on your usage, you may want to go directly for a full web engine.
Beware that this adds a huge overhead, but you get:
- support for CSS (some content is generated by CSS, which can include images)
- JavaScript execution (again, some content is generated by JavaScript)
- other features, such as support for other markup languages (MathML,
SVG, ...) and bleeding-edge support for HTML5

Examples of open-source web engines are WebKit (http://webkit.org) and
Gecko (https://developer.mozilla.org/en/Gecko). Those are the most
feature-rich ones; there are surely more lightweight solutions, but I
don't know of any.

Regards,
Julien
-------------------------------------------------------------------
Received on 2010-02-18