curl-library

Re: "URLs are dangerous things"

From: bch <brad.harder_at_gmail.com>
Date: Thu, 08 Feb 2018 17:17:11 +0000

On Thu, Feb 8, 2018 at 8:58 AM Daniel Stenberg <daniel_at_haxx.se> wrote:

> On Thu, 8 Feb 2018, Dennis Clarke wrote:
>
> > There is nothing wrong with RFC-3986 nor the more specific RFC-8089.
>
> RFC 3986 is for generic URIs. RFC 8089 is for the specific subset file: URIs.
> They're different beasts.
>
> The "wrong" about 3986 is that people and software are more and more often
> using URLs that violate that spec now.
>
> > The very fact that WHATWG is very browser focused causes me to ignore
> > whatever they are doing.
>
> I too am tempted to take that stand, but unfortunately I don't think that
> benefits our users much.
>
> We occasionally see URLs being used on the web in the wild that "work in my
> browser" but they don't work in curl. They end up as curl's problem either by
> users copying the URLs from the browser's address bar, users doing "copy
> link" or simply when asking curl to follow HTTP redirects - and more.
>
> Over time we've (reluctantly) added adaptions when curl users have suffered.
> We now handle one, two or three slashes after the "scheme:" part, we
> url-encode illegal letters in redirect URLs (since people actually send such
> and the browsers deal with them) and so on. And I suspect we've not seen the
> end of those compromises.
>

Is there a way to see which “quirks” have been applied to a URL? It’d be
illustrative to see or retrieve info that says: “cURL adapted for
scheme/slash count”, or “automatic encoding employed”...
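
To make the idea concrete, here is a minimal sketch of what such a report
could look like. It is an illustration only, not curl's code or API: the
function name normalize_with_report, the message strings, and the choice to
adjust slashes only for http/https are all assumptions made for this example.

```python
import re
from urllib.parse import quote

# "scheme:" followed by any number of slashes (pattern is an assumption).
_SCHEME_SLASHES = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(/*)")

# Characters left untouched: RFC 3986 reserved characters plus "%", so that
# existing percent-escapes are not encoded a second time.
_SAFE = "%:/?#[]@!$&'()*+,;=-._~"


def normalize_with_report(url):
    """Return (normalized_url, quirks); quirks lists each adaptation made.

    Hypothetical helper for illustration, not part of curl or libcurl.
    """
    quirks = []

    # Quirk 1: accept one or three slashes after "http:" / "https:".
    m = _SCHEME_SLASHES.match(url)
    if m and m.group(1).lower() in ("http", "https"):
        slashes = len(m.group(2))
        if slashes in (1, 3):
            url = f"{m.group(1)}://{url[m.end():]}"
            quirks.append(f"slash count adapted: {slashes} -> 2")

    # Quirk 2: percent-encode characters that are not allowed in a URL.
    encoded = quote(url, safe=_SAFE)
    if encoded != url:
        quirks.append("automatic encoding employed")

    return encoded, quirks


if __name__ == "__main__":
    for candidate in ("http:///example.com/a b", "https://example.com/x?y=1"):
        fixed, notes = normalize_with_report(candidate)
        print(candidate, "->", fixed, notes)
```

A caller could log the returned list or surface it in verbose output; an
empty list means the URL was used exactly as given.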

> URLs are not scoped to work within browsers *or* non-browsers. They work
> seamlessly across the entire Internet. They worked 20 years ago and I'm
> willing to bet they'll exist in another 20 years as well. The question is only
> exactly how to parse them... I think we as a community suffer as long as
> there isn't a one true URL spec.
>
> I also work on a separate document where I try to nail down exactly what
> differences there are between the two - three primary URL specs:
>
> https://github.com/bagder/docs/blob/master/URL-interop.md
>
> --
>
> / daniel.haxx.se
> -------------------------------------------------------------------
> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
> Etiquette: https://curl.haxx.se/mail/etiquette.html

Received on 2018-02-08