New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding zstd decoding support #5453
Conversation
Under Ubuntu, we can install zstd library with apt install libzstd-dev Can anyone help me integrate a zstd test in the curl test system ? |
To install a server with zstd support |
See test 1170 for inspiration. You basically only need a binary dump of what the server should respond, then put that base64-encoded in the |
I can, by example, translare the three brotli test (314/315/316), but add a build configuration with zstd (probably an Ubuntu with libzstd-dev package) and, configure the test only for zstd build is not easy for me |
I'll write up a configure check for you... |
I put it in a separate branch for you. Try it out and see if it works out. |
lib/content_encoding.c
Outdated
#ifdef HAVE_ZSTD | ||
/* Writer parameters. */ | ||
struct zstd_params { | ||
ZSTD_DStream *zds; /* State structure for zstd. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not curl indent-level! (and more of that later but I won't repeat myself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I'll have to find a solution with visual studio 2019 and curl rules
lib/content_encoding.c
Outdated
ZSTD_outBuffer out; | ||
size_t errorCode; | ||
|
||
if(zp->decomp == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we typically use if(!zp->decomp)
instead of checking for NULL
ce5f499
to
c57930c
Compare
I rebased my work on your branch. Do you prefer I just copy your file and rebase on master? |
It is probably easiest if you just copy (cherry-pick) my commit into your branch |
I tried your patch on Windows/MSVC-2019 and it works fine. |
I added libzstd-dev package on an ubuntu focal build on .travis.yml but I'm unable to find it on all the continuous integration build. |
Hey! I'm from the zstd team (and I implemented the support for the zstd content-coding for facebook.com). I'm really excited to see zstd make its way into curl! Let me know if you have any questions. |
Ah strike that, it should build here... |
You can see configure detect and enabled it here - so that much seems to work. But note that you added that to the build that explicitly does not run the test suite. |
776840d
to
51b05d8
Compare
I'm attaching a patch to add zstd support to the MinGW aka patch
diff --git a/docs/examples/Makefile.m32 b/docs/examples/Makefile.m32
index d0694cfa3..e8c0d376f 100644
--- a/docs/examples/Makefile.m32
+++ b/docs/examples/Makefile.m32
@@ -24,7 +24,7 @@
#
## Makefile for building curl examples with MingW (GCC-3.2 or later)
## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
##
## Usage: mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
ifndef ZLIB_PATH
ZLIB_PATH = ../../../zlib-1.2.8
endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../../zstd-1.4.5
+endif
# Edit the path below to point to the base of your Brotli sources.
ifndef BROTLI_PATH
BROTLI_PATH = ../../../brotli-1.0.1
@@ -180,6 +184,9 @@ endif
ifeq ($(findstring -zlib,$(CFG)),-zlib)
ZLIB = 1
endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
ifeq ($(findstring -brotli,$(CFG)),-brotli)
BROTLI = 1
endif
@@ -284,6 +291,11 @@ ifdef ZLIB
CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
curl_LDADD += -L"$(ZLIB_PATH)" -lz
endif
+ifdef ZSTD
+ INCLUDES += -I"$(ZSTD_PATH)/include"
+ CFLAGS += -DHAVE_ZSTD
+ curl_LDADD += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
ifdef BROTLI
INCLUDES += -I"$(BROTLI_PATH)/include"
CFLAGS += -DHAVE_BROTLI
diff --git a/lib/Makefile.m32 b/lib/Makefile.m32
index fe8701bdb..02b31106c 100644
--- a/lib/Makefile.m32
+++ b/lib/Makefile.m32
@@ -24,7 +24,7 @@
#
## Makefile for building libcurl.a with MingW (GCC-3.2 or later or LLVM/Clang)
## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
##
## Usage: mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
ifndef ZLIB_PATH
ZLIB_PATH = ../../zlib-1.2.8
endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../zstd-1.4.5
+endif
# Edit the path below to point to the base of your Brotli sources.
ifndef BROTLI_PATH
BROTLI_PATH = ../../brotli-1.0.1
@@ -180,6 +184,9 @@ endif
ifeq ($(findstring -zlib,$(CFG)),-zlib)
ZLIB = 1
endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
ifeq ($(findstring -brotli,$(CFG)),-brotli)
BROTLI = 1
endif
@@ -288,6 +295,11 @@ ifdef ZLIB
CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
DLL_LIBS += -L"$(ZLIB_PATH)" -lz
endif
+ifdef ZSTD
+ INCLUDES += -I"$(ZSTD_PATH)/include"
+ CFLAGS += -DHAVE_ZSTD
+ DLL_LIBS += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
ifdef BROTLI
INCLUDES += -I"$(BROTLI_PATH)/include"
CFLAGS += -DHAVE_BROTLI
diff --git a/src/Makefile.m32 b/src/Makefile.m32
index 802a1da0f..afb4fd547 100644
--- a/src/Makefile.m32
+++ b/src/Makefile.m32
@@ -24,7 +24,7 @@
#
## Makefile for building curl.exe with MingW (GCC-3.2 or later or LLVM/Clang)
## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
##
## Usage: mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
ifndef ZLIB_PATH
ZLIB_PATH = ../../zlib-1.2.8
endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../zstd-1.4.5
+endif
# Edit the path below to point to the base of your Brotli sources.
ifndef BROTLI_PATH
BROTLI_PATH = ../../brotli-1.0.1
@@ -189,6 +193,9 @@ endif
ifeq ($(findstring -zlib,$(CFG)),-zlib)
ZLIB = 1
endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
ifeq ($(findstring -brotli,$(CFG)),-brotli)
BROTLI = 1
endif
@@ -302,6 +309,11 @@ ifdef ZLIB
CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
curl_LDADD += -L"$(ZLIB_PATH)" -lz
endif
+ifdef ZSTD
+ INCLUDES += -I"$(ZSTD_PATH)/include"
+ CFLAGS += -DHAVE_ZSTD
+ curl_LDADD += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
ifdef BROTLI
INCLUDES += -I"$(BROTLI_PATH)/include"
CFLAGS += -DHAVE_BROTLI |
52fa79b
to
cf2ab8f
Compare
zstd tests 396 and 397 have success 2020-05-27T02:48:29.8568226Z --p----e--- OK (357 out of 1342, remaining: 04:37, took 0.075s, duration: 01:40) |
@Lekensteyn you up for adding a cmake fix for zstd to add to this? |
@bagder Do you think last check fails are related to my modification? It's not easy to understand them... regards! |
No I don't think you can be blamed for them, they look like the regular set failures due to flaky CI. |
Added CMake support, tested on Arch Linux with zstd 1.4.4-1. I observed that the support for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of nits, looks good otherwise.
.azure-pipelines.yml
Outdated
env: | ||
AZURE_ACCESS_TOKEN: "$(System.AccessToken)" | ||
TFLAGS: "" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for adding a new job as opposed to combining it with another test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just I was afraid broken anything...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably reuse an existing job to reduce the carbon footprint, but it's up to @bagder :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @Lekensteyn, we should rather combine this into an existing job. Not only the carbon footprint wins, we also get the complete results done faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, but I prefer you choose the existing job to add zstd.
I must admit my focus was on zstd code, makefile and test and not the better CI solution... :-)
size_t errorCode; | ||
|
||
if(!zp->decomp) { | ||
zp->decomp = malloc(DSIZ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious on this malloc, and maybe @felixhandte can help us out here (I tried to find docs that would explain this but I failed). How is this particular memory block size chosen? Is there a benefit to use any particular size and if so, what's the trade-offs? DSIZ
is CURL_MAX_WRITE_SIZE
, 16K by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is size used by other decompressor (both brotli and zlib). I believed this was optimal size for Curl_unencode_write.
Zstd is happy with this size or other size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
16KB should be fine here.
Optimizing output buffer sizes for Zstd decompression is mostly a best effort sort of thing, and generally not performance critical. If Zstd gets a frame that it knows will be smaller than the provided output buffer, it can decompress straight into that buffer. But if you receive a streamed response (i.e., one that doesn't announce its total size in the Frame_Content_Size
field in the frame header), Zstd has to internally buffer the output anyways to maintain its history window. So the output buffer size doesn't really matter, we're just memcpy
-ing into it from that internal buffer that we are actually doing the decompression into.
Which, by the way, raises the topic of window sizes. libzstd
has a default limit of 128 MB, which means that the ZSTD_DCtx
may internally allocate up to ~128 MB, and anything that requires more will be rejected. This limit can be configured: I'm not sure whether curl wants to expose this to users, or whether curl wants to select a different limit by default. The RFC recommends for interoperability that, in the absence of negotiation, encoders not use a window larger than 8 MB:
For improved interoperability, it's recommended for decoders to
support values of Window_Size up to 8 MB and for encoders not to
generate frames requiring a Window_Size larger than 8 MB. It's
merely a recommendation though, and decoders are free to support
larger or lower limits, depending on local limitations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made minor decompression code to follow @Cyan4973 suggestion about exit loop condition
5bda517
to
17a62b4
Compare
2ac3e54
to
c7a258b
Compare
include zstd curl patch for Makefile.m32 from vszakats and include Add CMake support for zstd from Peter Wu
@bagder do you think we need more work/test on this feature? |
Thanks! |
There's zstd support now: curl/curl#5453 Since it is detected and enabled by default, let's declare it a dependency explicitly (so that `curl.exe` does not fail to start just because `libzstd.dll` is not found). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
option(CURL_ZSTD "Set to ON to enable building curl with zstd support." OFF) | ||
set(HAVE_ZSTD OFF) | ||
if(CURL_ZSTD) | ||
find_package(Zstd REQUIRED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably too late to ask, but why relying on a custom CMake module file, while zstd provides an official CMake config file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think we can improve the cmake stuff for zstd, feel free to submit a pull-request!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I remember: zstd provides config file since 1.4.5 (released in May of this year), so it makes sense to have a custom module file in order to properly handle previous versions.
@bagder Yep, when I have time, I'll see if there is a way to first check for zstd config file (more robust), then fallback to custom module file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use the same technique CMake does for curl: https://gitlab.kitware.com/cmake/cmake/-/blob/39677de5e209445c8cbc5957c1e79088d5d2a03a/Modules/FindCURL.cmake
So in the find module first check if zstd can be found in config mode. If not, do custom search and provide imported targets consistent with what zstd config currently produces.
Zstd is a very fast compressor, and give better speed vs compression ratio than gzip/deflate or brotli.
This pull request add zstd support on curl and libcurl
to test
curl --compressed https://de-de.facebook.com/
curl --compressed https://de-de.facebook.com/ --verbose
Somes links:
https://www.iana.org/assignments/http-parameters/http-parameters.xhtml
https://datatracker.ietf.org/doc/draft-kucherawy-rfc8478bis/
https://tools.ietf.org/html/rfc8478
https://facebook.github.io/zstd/