curl-library

Re: streaming data with libcurl

From: Ray Satiro via curl-library <curl-library_at_cool.haxx.se>
Date: Sun, 7 May 2017 02:14:02 -0400

On 4/25/2017 8:28 PM, Rui Wang via curl-library wrote:
> I'm trying to process some binary data stream with libcurl. The data
> format is like the following:
>
> 4 bytes header that contains length of attributes, then attributes,
> then 8 bytes header that contains length of data, then data.
>
> Both attributes and data are optional. So it could be 4bytes followed
> by attributes, then 8bytes that contains a 0, another 4 bytes followed
> by attributes, and so on.
>
> What I want to do is to process the attributes or the data as it
> arrives instead of waiting for everything. I could use the
> writefunction to parse the data and extracts attributes or data when
> it's ready. However, I can only write that data to somewhere, but I
> can't pipe that data item into another function since the function
> signature is already there. I don't know if I could have a custom
> callback function triggered once the data item is ready just like the
> writefunction. That would be the most efficient way to handle the
> incoming stream.

[...]

On 4/26/2017 2:36 PM, Rui Wang via curl-library wrote:
> Actually, I was blocked on how I could pass a callback to the
> writedata callback. Last night I thought about this idea: there is a
> custom data structure that I could parse into writedata function, in
> the example 'getinmemory' it only contains size and a pointer to the
> memory buffer that holds the data. I could also include a function
> pointer field in this structure, then I could have custom callback to
> execute when the data is ready. I feel that conceptually this should
> work, but am not sure if I miss anything. Do you think this works?

Please don't top-post; it makes the conversation hard to follow [1].

As best I can understand from your replies, you want to parse each
completed data section as it arrives instead of "waiting for
everything", and each data section in the stream is in this format:

[[<4 bytes: attribute size><attributes><8 bytes: data size><data>]...]

Yes, you can build on the getinmemory example to parse that. I made an
example called ParseStream [2] that's built off of getinmemory and shows
one way it can be done. If you click on the 'Revisions' tab of the
ParseStream example you can see the changes between it and getinmemory.
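
To illustrate the function-pointer idea from your second mail, here is a
minimal sketch. It is not the code from the gist. The names StreamState,
section_cb, on_data and ConsumeQuads are made up for this illustration, and
ConsumeQuads is only a stand-in for a real parser. WriteCallback has the
signature libcurl expects for CURLOPT_WRITEFUNCTION, but nothing here calls
libcurl, so it can be tried on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook type: inspects buf[0..len) and returns how many leading
   bytes it fully processed (0 if it needs more data first). */
typedef size_t (*section_cb)(const unsigned char *buf, size_t len, void *user);

/* getinmemory's buffer struct, extended with a callback and a user pointer. */
struct StreamState {
  unsigned char *memory;  /* unparsed bytes received so far */
  size_t size;            /* number of valid bytes in memory */
  section_cb on_data;     /* called after every append, may be NULL */
  void *user;             /* passed through to on_data untouched */
};

/* Appends the new bytes, lets the hook consume what it can, and discards
   whatever the hook reported as processed. */
size_t WriteCallback(char *contents, size_t size, size_t nmemb, void *userp)
{
  size_t realsize = size * nmemb;
  struct StreamState *st = (struct StreamState *)userp;
  unsigned char *p;
  size_t used;

  if(!realsize)
    return 0;

  p = realloc(st->memory, st->size + realsize);
  if(!p)
    return 0;  /* a return value other than realsize makes libcurl abort */

  memcpy(p + st->size, contents, realsize);
  st->memory = p;
  st->size += realsize;

  if(st->on_data) {
    used = st->on_data(st->memory, st->size, st->user);
    if(used > st->size)
      return 0;  /* the hook claimed more than it was given */
    memmove(st->memory, st->memory + used, st->size - used);
    st->size -= used;
  }
  return realsize;
}

/* Example hook (illustrative only): treats every complete 4-byte group as
   one item, counts it in *(int *)user, and leaves any remainder buffered. */
size_t ConsumeQuads(const unsigned char *buf, size_t len, void *user)
{
  size_t items = len / 4;
  (void)buf;
  *(int *)user += (int)items;
  return items * 4;
}
```

With libcurl you would register this through
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback) and
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &st), where st is a zero-initialized
struct StreamState with on_data and user filled in.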

Notably, in the WriteFunction, if the HTTP response code is not 200 then the
data is ignored and nothing is written. If it is 200, we assume
the stream is valid and append the received data to the rest of
the unparsed data in memory. Parsing is then attempted by calling the function
ParseStream on the unparsed data. Any sections that are parsed are
sent individually to another function, Notify, which prints them to the screen:

**************************************************************************
Notify: A data section has been received.

Attributes length is 3 bytes (0x3)
00000000: 66 6f 6f foo

Data length is 3 bytes (0x3)
00000000: 62 61 72 bar
**************************************************************************

You'd replace Notify with wherever you are sending the data. ParseStream
then discards the parsed data and returns, and then the WriteFunction
returns. libcurl waits until more data is received, and the cycle continues.
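
The parse-and-discard cycle described above can be sketched roughly as
follows. This is a simplified illustration and not the gist's ParseStream. The
names parse_sections, notify_cb, Tally and tally_notify are hypothetical, the
sizes are assumed to be little endian, and tally_notify just counts what it is
given instead of printing a hex dump.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode little-endian integers byte by byte, independent of host order. */
uint32_t load32_le(const unsigned char *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint64_t load64_le(const unsigned char *p)
{
  uint64_t v = 0;
  int i;
  for(i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Hypothetical callback type: receives one complete section. */
typedef void (*notify_cb)(const unsigned char *attrs, size_t attrs_len,
                          const unsigned char *data, size_t data_len,
                          void *user);

/* Calls notify for every complete section at the start of buf and returns
   the number of bytes those sections occupy. The caller discards that many
   bytes and keeps the rest for the next call. */
size_t parse_sections(const unsigned char *buf, size_t len,
                      notify_cb notify, void *user)
{
  size_t pos = 0;
  for(;;) {
    size_t left = len - pos;
    uint32_t alen;
    uint64_t dlen;

    if(left < 4)
      break;                      /* attribute size header incomplete */
    alen = load32_le(buf + pos);
    if(left - 4 < alen)
      break;                      /* attributes incomplete */
    if(left - 4 - alen < 8)
      break;                      /* data size header incomplete */
    dlen = load64_le(buf + pos + 4 + alen);
    if(left - 4 - alen - 8 < dlen)
      break;                      /* data incomplete */

    notify(buf + pos + 4, alen, buf + pos + 4 + alen + 8, (size_t)dlen, user);
    pos += 4 + (size_t)alen + 8 + (size_t)dlen;
  }
  return pos;
}

/* Illustrative stand-in for Notify: tallies what it receives. */
struct Tally {
  int sections;
  size_t attr_bytes;
  size_t data_bytes;
};

void tally_notify(const unsigned char *attrs, size_t attrs_len,
                  const unsigned char *data, size_t data_len, void *user)
{
  struct Tally *t = (struct Tally *)user;
  (void)attrs;
  (void)data;
  t->sections++;
  t->attr_bytes += attrs_len;
  t->data_bytes += data_len;
}
```

A section with zero-length attributes and zero-length data is 12 bytes of
zeros and is still reported, which matches the "both are optional" case in
your description. A real parser would probably also want an upper bound on the
sizes it accepts, so a corrupt header cannot make the buffer grow forever.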

Regarding your question about queues, threading, etc.: yes, there are a
bunch of ways you can handle the data. Which way is right for you I
don't know. My example shows the easiest one: parsing and notification
happen synchronously inside the WriteFunction, which blocks until they finish.

Note that I decoded the attribute and data sizes as little endian. If
they're sent big endian, replace the decoding with the curl_endian.c
read{32,64}_be functions [3].
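
If you'd rather not copy code out of curl, big-endian decoding is short enough
to write by hand. The sketch below is not the curl_endian.c code, and the
names load32_be and load64_be are made up for this illustration.

```c
#include <stdint.h>

/* Big endian: the most significant byte comes first in the buffer. */
uint32_t load32_be(const unsigned char *p)
{
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

uint64_t load64_be(const unsigned char *p)
{
  uint64_t v = 0;
  int i;
  for(i = 0; i < 8; i++)
    v = (v << 8) | p[i];  /* shift in one byte at a time, high byte first */
  return v;
}
```

Decoding byte by byte like this gives the same result on any host, so no
byte-swapping macros or alignment assumptions are needed.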

[1]: https://curl.haxx.se/mail/etiquette.html#Do_Not_Top_Post
[2]: https://gist.github.com/jay/f355d98e87fde19b1455b0b31dd118fd
[3]: https://github.com/curl/curl/blob/curl-7_54_0/lib/curl_endian.c#L156

Received on 2017-05-07