Re: what is the best way to extract urls from web page with?
Date: Tue, 6 Nov 2007 13:03:51 +0200
On 06.11.2007, at 12:27, <hallouina-ml_at_yahoo.fr> wrote:
> Hello everybody,
> Please, I would like to know how to extract URLs from a web page with
> C++. I first thought of using a regex, but that way I may only be able
> to extract the first URL? Then I thought of handling my web page as an
> HTML tree, like TreeBuilder in Perl. A friend says that doing it this
> way is slow. Otherwise, I don't know what kind of tools can extract
> URLs by treating my page like a tree.
> What is the best way? If it is to handle the page like a tree, what
> kind of simple library could I use, please?
I think any regular expression engine provides search functionality.
You can use boost::regex_search, for example; the Boost.Regex
documentation contains a reference to its examples page.
I think there's no need to build a tree unless you really need it.
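
To get every URL rather than only the first one, the search can be
repeated, each time starting from the end of the previous match. Here is
a minimal sketch of that idea. It uses std::regex_search from the C++11
standard library rather than boost::regex_search, only so that it builds
without Boost; the function name extract_urls and the href pattern are
my own assumptions, and the pattern is a rough simplification that only
handles quoted href attributes.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the values of all quoted href attributes.
// The pattern is a simplification for illustration, not an HTML parser.
std::vector<std::string> extract_urls(const std::string& html) {
    static const std::regex href_re(
        R"re(href\s*=\s*["']([^"']+)["'])re", std::regex::icase);

    std::vector<std::string> urls;
    std::string::const_iterator start = html.begin();
    std::smatch m;

    // Each successful search resumes where the previous match ended,
    // so all matches are found, not only the first.
    while (std::regex_search(start, html.cend(), m, href_re)) {
        urls.push_back(m[1].str());  // capture group 1 is the URL text
        start = m[0].second;         // end of the whole match
    }
    return urls;
}
```

Calling extract_urls on a page with two links should return both href
values in document order. Unquoted attributes, URLs in plain text, and
relative links that need resolving are not handled by this sketch.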
Oh! If you're using C, then try PCRE and check for
--
Regards, Yaroslav Samchuk
Received on 2007-11-06