Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding zstd decoding support #5453

Closed
wants to merge 1 commit into from
Closed

Adding zstd decoding support #5453

wants to merge 1 commit into from

Conversation

gvollant
Copy link
Contributor

Zstd is a very fast compressor, and give better speed vs compression ratio than gzip/deflate or brotli.

This pull request add zstd support on curl and libcurl

to test
curl --compressed https://de-de.facebook.com/
curl --compressed https://de-de.facebook.com/ --verbose

Somes links:
https://www.iana.org/assignments/http-parameters/http-parameters.xhtml

https://datatracker.ietf.org/doc/draft-kucherawy-rfc8478bis/

https://tools.ietf.org/html/rfc8478

https://facebook.github.io/zstd/

@gvollant
Copy link
Contributor Author

Under Ubuntu, we can install zstd library with apt install libzstd-dev

Can anyone help me integrate a zstd test in the curl test system ?

@gvollant
Copy link
Contributor Author

To install a server with zstd support
https://github.com/tokers/zstd-nginx-module

@bagder
Copy link
Member

bagder commented May 25, 2020

Can anyone help me integrate a zstd test in the curl test system ?

See test 1170 for inspiration. You basically only need a binary dump of what the server should respond, then put that base64-encoded in the <data> section.

@gvollant
Copy link
Contributor Author

I can, by example, translare the three brotli test (314/315/316), but add a build configuration with zstd (probably an Ubuntu with libzstd-dev package) and, configure the test only for zstd build is not easy for me

@bagder
Copy link
Member

bagder commented May 25, 2020

I'll write up a configure check for you...

@bagder
Copy link
Member

bagder commented May 25, 2020

I put it in a separate branch for you. Try it out and see if it works out.

docs/libcurl/curl_version_info.3 Show resolved Hide resolved
#ifdef HAVE_ZSTD
/* Writer parameters. */
struct zstd_params {
ZSTD_DStream *zds; /* State structure for zstd. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not curl indent-level! (and more of that later but I won't repeat myself)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'll have to find a solution with visual studio 2019 and curl rules

lib/content_encoding.c Outdated Show resolved Hide resolved
ZSTD_outBuffer out;
size_t errorCode;

if(zp->decomp == NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we typically use if(!zp->decomp) instead of checking for NULL

lib/content_encoding.c Show resolved Hide resolved
lib/content_encoding.c Outdated Show resolved Hide resolved
docs/libcurl/symbols-in-versions Outdated Show resolved Hide resolved
@gvollant gvollant force-pushed the zstd branch 5 times, most recently from ce5f499 to c57930c Compare May 26, 2020 06:21
@gvollant
Copy link
Contributor Author

I put it in a separate branch for you. Try it out and see if it works out.

I rebased my work on your branch. Do you prefer I just copy your file and rebase on master?

@bagder
Copy link
Member

bagder commented May 26, 2020

It is probably easiest if you just copy (cherry-pick) my commit into your branch

@gvanem
Copy link
Contributor

gvanem commented May 26, 2020

I tried your patch on Windows/MSVC-2019 and it works fine.
I've not much to compare with, but the above facebook link transferred fast!

@gvollant
Copy link
Contributor Author

I added libzstd-dev package on an ubuntu focal build on .travis.yml but I'm unable to find it on all the continuous integration build.

@felixhandte
Copy link

Hey! I'm from the zstd team (and I implemented the support for the zstd content-coding for facebook.com).

I'm really excited to see zstd make its way into curl! Let me know if you have any questions.

@bagder
Copy link
Member

bagder commented May 26, 2020

I added libzstd-dev package on an ubuntu focal build on .travis.yml but I'm unable to find it on all the continuous integration build.

Do you have travis enabled for your repo? Looking at your repo it doesn't look like so...

Ah strike that, it should build here...

@bagder
Copy link
Member

bagder commented May 26, 2020

You can see configure detect and enabled it here - so that much seems to work. But note that you added that to the build that explicitly does not run the test suite.

@gvollant gvollant force-pushed the zstd branch 6 times, most recently from 776840d to 51b05d8 Compare May 26, 2020 23:09
@vszakats
Copy link
Member

vszakats commented May 26, 2020

I'm attaching a patch to add zstd support to the MinGW aka Makefile.m32 build method. It's an untested, but fairly simple modification, and should work together with a pending patch that adds zstd builds to curl-for-win. The latter has been tested successfully earlier today on a local machine:

zstd-m32.patch.txt

patch

diff --git a/docs/examples/Makefile.m32 b/docs/examples/Makefile.m32
index d0694cfa3..e8c0d376f 100644
--- a/docs/examples/Makefile.m32
+++ b/docs/examples/Makefile.m32
@@ -24,7 +24,7 @@
 #
 ## Makefile for building curl examples with MingW (GCC-3.2 or later)
 ## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
 ##
 ## Usage:   mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
 ## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
 ifndef ZLIB_PATH
 ZLIB_PATH = ../../../zlib-1.2.8
 endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../../zstd-1.4.5
+endif
 # Edit the path below to point to the base of your Brotli sources.
 ifndef BROTLI_PATH
 BROTLI_PATH = ../../../brotli-1.0.1
@@ -180,6 +184,9 @@ endif
 ifeq ($(findstring -zlib,$(CFG)),-zlib)
 ZLIB = 1
 endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
 ifeq ($(findstring -brotli,$(CFG)),-brotli)
 BROTLI = 1
 endif
@@ -284,6 +291,11 @@ ifdef ZLIB
   CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
   curl_LDADD += -L"$(ZLIB_PATH)" -lz
 endif
+ifdef ZSTD
+  INCLUDES += -I"$(ZSTD_PATH)/include"
+  CFLAGS += -DHAVE_ZSTD
+  curl_LDADD += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
 ifdef BROTLI
   INCLUDES += -I"$(BROTLI_PATH)/include"
   CFLAGS += -DHAVE_BROTLI
diff --git a/lib/Makefile.m32 b/lib/Makefile.m32
index fe8701bdb..02b31106c 100644
--- a/lib/Makefile.m32
+++ b/lib/Makefile.m32
@@ -24,7 +24,7 @@
 #
 ## Makefile for building libcurl.a with MingW (GCC-3.2 or later or LLVM/Clang)
 ## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
 ##
 ## Usage:   mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
 ## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
 ifndef ZLIB_PATH
 ZLIB_PATH = ../../zlib-1.2.8
 endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../zstd-1.4.5
+endif
 # Edit the path below to point to the base of your Brotli sources.
 ifndef BROTLI_PATH
 BROTLI_PATH = ../../brotli-1.0.1
@@ -180,6 +184,9 @@ endif
 ifeq ($(findstring -zlib,$(CFG)),-zlib)
 ZLIB = 1
 endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
 ifeq ($(findstring -brotli,$(CFG)),-brotli)
 BROTLI = 1
 endif
@@ -288,6 +295,11 @@ ifdef ZLIB
   CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
   DLL_LIBS += -L"$(ZLIB_PATH)" -lz
 endif
+ifdef ZSTD
+  INCLUDES += -I"$(ZSTD_PATH)/include"
+  CFLAGS += -DHAVE_ZSTD
+  DLL_LIBS += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
 ifdef BROTLI
   INCLUDES += -I"$(BROTLI_PATH)/include"
   CFLAGS += -DHAVE_BROTLI
diff --git a/src/Makefile.m32 b/src/Makefile.m32
index 802a1da0f..afb4fd547 100644
--- a/src/Makefile.m32
+++ b/src/Makefile.m32
@@ -24,7 +24,7 @@
 #
 ## Makefile for building curl.exe with MingW (GCC-3.2 or later or LLVM/Clang)
 ## and optionally OpenSSL (1.0.2a), libssh2 (1.5), zlib (1.2.8), librtmp (2.4),
-## brotli (1.0.1)
+## brotli (1.0.1), zstd (1.4.5)
 ##
 ## Usage:   mingw32-make -f Makefile.m32 CFG=-feature1[-feature2][-feature3][...]
 ## Example: mingw32-make -f Makefile.m32 CFG=-zlib-ssl-sspi-winidn
@@ -39,6 +39,10 @@
 ifndef ZLIB_PATH
 ZLIB_PATH = ../../zlib-1.2.8
 endif
+# Edit the path below to point to the base of your Zstandard sources.
+ifndef ZSTD_PATH
+ZSTD_PATH = ../../zstd-1.4.5
+endif
 # Edit the path below to point to the base of your Brotli sources.
 ifndef BROTLI_PATH
 BROTLI_PATH = ../../brotli-1.0.1
@@ -189,6 +193,9 @@ endif
 ifeq ($(findstring -zlib,$(CFG)),-zlib)
 ZLIB = 1
 endif
+ifeq ($(findstring -zstd,$(CFG)),-zstd)
+ZSTD = 1
+endif
 ifeq ($(findstring -brotli,$(CFG)),-brotli)
 BROTLI = 1
 endif
@@ -302,6 +309,11 @@ ifdef ZLIB
   CFLAGS += -DHAVE_LIBZ -DHAVE_ZLIB_H
   curl_LDADD += -L"$(ZLIB_PATH)" -lz
 endif
+ifdef ZSTD
+  INCLUDES += -I"$(ZSTD_PATH)/include"
+  CFLAGS += -DHAVE_ZSTD
+  curl_LDADD += -L"$(ZSTD_PATH)/lib" -lzstd
+endif
 ifdef BROTLI
   INCLUDES += -I"$(BROTLI_PATH)/include"
   CFLAGS += -DHAVE_BROTLI

@gvollant gvollant force-pushed the zstd branch 2 times, most recently from 52fa79b to cf2ab8f Compare May 27, 2020 02:43
@gvollant
Copy link
Contributor Author

zstd tests 396 and 397 have success
https://github.com/curl/curl/pull/5453/checks?check_run_id=711600257

https://dev.azure.com/daniel0244/curl/_build/results?buildId=2039&view=logs&jobId=8a21621c-000d-5b56-b920-3535c5db3dfc&j=8a21621c-000d-5b56-b920-3535c5db3dfc&t=7573036f-f328-5f4c-a599-58f1a97b0344

2020-05-27T02:48:29.8568226Z --p----e--- OK (357 out of 1342, remaining: 04:37, took 0.075s, duration: 01:40)
2020-05-27T02:48:29.8569333Z test 0396...[HTTP GET zstd compressed content]
2020-05-27T02:48:29.9184975Z --pd---e--- OK (358 out of 1342, remaining: 04:36, took 0.061s, duration: 01:40)
2020-05-27T02:48:29.9186066Z test 0397...[HTTP GET zstd compressed content of size more than CURL_MAX_WRITE_SIZE]
2020-05-27T02:48:29.9611994Z --pd---e--- OK (359 out of 1342, remaining: 04:35, took 0.035s, duration: 01:40)

@bagder
Copy link
Member

bagder commented May 27, 2020

@Lekensteyn you up for adding a cmake fix for zstd to add to this?

@gvollant
Copy link
Contributor Author

@bagder Do you think last check fails are related to my modification? It's not easy to understand them...

regards!

@bagder
Copy link
Member

bagder commented May 27, 2020

No I don't think you can be blamed for them, they look like the regular set failures due to flaky CI.

@Lekensteyn
Copy link
Contributor

Added CMake support, tested on Arch Linux with zstd 1.4.4-1.

I observed that the support for ZSTD_createDStream was added for the first time in zstd v0.8.1 (facebook/zstd@5a0c8e2), so the symbol check is indeed necessary. Otherwise curl could potentially fail to build on Ubuntu 16.04.

Copy link
Contributor

@Lekensteyn Lekensteyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits, looks good otherwise.

tests/runtests.pl Outdated Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
env:
AZURE_ACCESS_TOKEN: "$(System.AccessToken)"
TFLAGS: ""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for adding a new job as opposed to combining it with another test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just I was afraid broken anything...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably reuse an existing job to reduce the carbon footprint, but it's up to @bagder :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @Lekensteyn, we should rather combine this into an existing job. Not only the carbon footprint wins, we also get the complete results done faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, but I prefer you choose the existing job to add zstd.

I must admit my focus was on zstd code, makefile and test and not the better CI solution... :-)

size_t errorCode;

if(!zp->decomp) {
zp->decomp = malloc(DSIZ);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious on this malloc, and maybe @felixhandte can help us out here (I tried to find docs that would explain this but I failed). How is this particular memory block size chosen? Is there a benefit to use any particular size and if so, what's the trade-offs? DSIZ is CURL_MAX_WRITE_SIZE, 16K by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is size used by other decompressor (both brotli and zlib). I believed this was optimal size for Curl_unencode_write.
Zstd is happy with this size or other size.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16KB should be fine here.

Optimizing output buffer sizes for Zstd decompression is mostly a best effort sort of thing, and generally not performance critical. If Zstd gets a frame that it knows will be smaller than the provided output buffer, it can decompress straight into that buffer. But if you receive a streamed response (i.e., one that doesn't announce its total size in the Frame_Content_Size field in the frame header), Zstd has to internally buffer the output anyways to maintain its history window. So the output buffer size doesn't really matter, we're just memcpy-ing into it from that internal buffer that we are actually doing the decompression into.

Which, by the way, raises the topic of window sizes. libzstd has a default limit of 128 MB, which means that the ZSTD_DCtx may internally allocate up to ~128 MB, and anything that requires more will be rejected. This limit can be configured: I'm not sure whether curl wants to expose this to users, or whether curl wants to select a different limit by default. The RFC recommends for interoperability that, in the absence of negotiation, encoders not use a window larger than 8 MB:

For improved interoperability, it's recommended for decoders to
support values of Window_Size up to 8 MB and for encoders not to
generate frames requiring a Window_Size larger than 8 MB. It's
merely a recommendation though, and decoders are free to support
larger or lower limits, depending on local limitations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made minor decompression code to follow @Cyan4973 suggestion about exit loop condition

@gvollant gvollant force-pushed the zstd branch 2 times, most recently from 5bda517 to 17a62b4 Compare May 31, 2020 12:17
@bagder bagder added the feature-window A merge of this requires an open feature window label Jun 6, 2020
@gvollant gvollant force-pushed the zstd branch 3 times, most recently from 2ac3e54 to c7a258b Compare June 25, 2020 00:04
include zstd curl patch for Makefile.m32 from vszakats
and include Add CMake support for zstd from Peter Wu
@gvollant
Copy link
Contributor Author

@bagder do you think we need more work/test on this feature?

@bagder bagder closed this in e13357b Jul 12, 2020
@bagder
Copy link
Member

bagder commented Jul 12, 2020

Thanks!

@gvollant gvollant deleted the zstd branch July 12, 2020 16:19
dscho added a commit to git-for-windows/MINGW-packages that referenced this pull request Sep 24, 2020
There's zstd support now: curl/curl#5453

Since it is detected and enabled by default, let's declare it a
dependency explicitly (so that `curl.exe` does not fail to start just
because `libzstd.dll` is not found).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
option(CURL_ZSTD "Set to ON to enable building curl with zstd support." OFF)
set(HAVE_ZSTD OFF)
if(CURL_ZSTD)
find_package(Zstd REQUIRED)
Copy link

@SpaceIm SpaceIm Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably too late to ask, but why relying on a custom CMake module file, while zstd provides an official CMake config file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think we can improve the cmake stuff for zstd, feel free to submit a pull-request!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I remember: zstd provides config file since 1.4.5 (released in May of this year), so it makes sense to have a custom module file in order to properly handle previous versions.

@bagder Yep, when I have time, I'll see if there is a way to first check for zstd config file (more robust), then fallback to custom module file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the same technique CMake does for curl: https://gitlab.kitware.com/cmake/cmake/-/blob/39677de5e209445c8cbc5957c1e79088d5d2a03a/Modules/FindCURL.cmake

So in the find module first check if zstd can be found in config mode. If not, do custom search and provide imported targets consistent with what zstd config currently produces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-window A merge of this requires an open feature window
Development

Successfully merging this pull request may close these issues.

None yet

8 participants