curl-users

Re: Curl --data-urlencode posts broken non-English characters

From: Daniel Stenberg <daniel_at_haxx.se>
Date: Wed, 3 Feb 2016 23:58:12 +0100 (CET)

On Wed, 3 Feb 2016, Irvin Jacob wrote:

> I have isolated the issue. It seems that the "@" functionality for options
> -d/--data and --data-urlencode always interprets Unicode characters with
> decimal range 128 and higher as UTF-8 even if I set the Content-Type:
> application/x-www-form-urlencoded charset header that Curl sends to
> something else.

curl is actually not charset aware at all. It will happily encode the exact
byte values it reads into percent-encoded sequences.
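
To illustrate the point, here is a minimal Python sketch. It is not curl's
code; it only mimics byte-wise percent-encoding, and the sample character and
the two charsets are picked for illustration. The same character ends up as
different percent-encoded output depending on which bytes were in the file:

```python
from urllib.parse import quote

text = "é"  # sample non-ASCII character, chosen for illustration

# The same character is stored as different bytes in different charsets.
utf8_bytes = text.encode("utf-8")          # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("iso-8859-1")   # one byte:  0xE9

# Percent-encoding operates on the raw bytes and knows nothing about
# which charset produced them.
utf8_encoded = quote(utf8_bytes)
latin1_encoded = quote(latin1_bytes)

print(utf8_encoded)    # %C3%A9
print(latin1_encoded)  # %E9
```

Whatever Content-Type header is sent alongside has no influence on these
bytes; only the encoding of the input does.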

> Curl should have support for the conversion of special Unicode characters
> (decimal range 128 and up) to numeric HTML entities via the "@file"
> interpretation, leave alone ASCII characters ranging from 0-127, and
> URL-encode ONLY safe non-printing characters like horizontal tab, newline,
> carriage return etc.

Sorry, but that won't fix this problem for you. The problem is that you pass
in the contents using a specific charset but you want it passed on using
another one. You'd then need curl to convert between them and I don't think
that is curl's job to do. The URL-encode function already does more or less what
you ask for.

I think you can do what you want by using the iconv tool.
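
As a rough sketch of that conversion step, written in Python rather than with
iconv itself: the helper name convert_charset, the form field name "text", the
sample string and the UTF-8 to ISO-8859-1 direction are all assumptions made
for the example, not something taken from the original report.

```python
from urllib.parse import quote


def convert_charset(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one charset to another (hypothetical helper)."""
    # Decode with the charset the file was saved in, then encode with the
    # charset the server expects. Raises UnicodeEncodeError if a character
    # has no representation in the target charset.
    return data.decode(source).encode(target)


# Pretend this is the file content, saved as UTF-8.
utf8_input = "naïve café".encode("utf-8")

# Convert first, then percent-encode the converted bytes.
converted = convert_charset(utf8_input, "utf-8", "iso-8859-1")
body = "text=" + quote(converted)

print(body)  # text=na%EFve%20caf%E9
```

On the command line, the same idea would be roughly
"iconv -f UTF-8 -t ISO-8859-1 input.txt > converted.txt" and then handing
converted.txt to --data-urlencode with the name@file syntax. The file names
here are placeholders.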

-- 
  / daniel.haxx.se
Received on 2016-02-03