curl-users

Re: Scrape text from the screen

From: Daniel Lublin <daniel_at_lublin.se>
Date: Mon, 18 Jun 2018 15:24:51 +0200

> I have a web page where the text is displayed from a server directly onto the
> screen, hence the data is not found in the web page source code.
>
> How can I use curl to scrape the text from the screen buffer? The
> displayed data can span multiple screens.

Hi Mike,

It sounds like you want to extract words, letters and digits from an image.
This is not something that curl does. Simply put, curl downloads
documents, such as text or images, from a location (URL).

The process of extracting words from an image is called optical character
recognition (OCR). You might want to search for a tool that does OCR on a
screenshot, or on an image of text rendered for the screen, as opposed to OCR
on a scanned sheet of paper. I think the workflows and processes may differ
somewhat.

If the web page in question just displays large images, you might end up
using curl to download these images and feed them into your OCR tool, but
that's it.
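
As a rough sketch of that download-then-OCR workflow, the Python below shells
out to curl and then to an OCR program. It assumes the Tesseract command-line
tool as the OCR step, which is only one possible choice and is not named in
the message above. The function names and the example URL are hypothetical,
and both `curl` and `tesseract` are assumed to be installed and on the PATH.

```python
import subprocess
from pathlib import Path


def curl_command(url: str, dest: Path) -> list[str]:
    """Build the curl invocation that saves `url` to the file `dest`."""
    # -f: treat HTTP errors as failures, -sS: quiet but still print errors,
    # -L: follow redirects, -o: write the response body to the given file.
    return ["curl", "-fsSL", "-o", str(dest), url]


def ocr_command(image: Path) -> list[str]:
    """Build a Tesseract invocation that prints recognized text to stdout."""
    # Assumption: Tesseract is the OCR tool; "stdout" as the output base
    # makes it print the text instead of writing a .txt file.
    return ["tesseract", str(image), "stdout"]


def scrape_image_text(url: str, dest: Path) -> str:
    """Download one image with curl, then return the text OCR finds in it."""
    subprocess.run(curl_command(url, dest), check=True)
    result = subprocess.run(
        ocr_command(dest), check=True, capture_output=True, text=True
    )
    return result.stdout
```

Calling `scrape_image_text("https://example.com/page-1.png", Path("page-1.png"))`
would fetch that (made-up) image and return whatever text the OCR step
recognizes. For data spread over several images, the same call could be
repeated once per image URL. The quality of the result depends entirely on the
OCR tool, not on curl.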
-----------------------------------------------------------
Received on 2018-06-18