TODO 1.7: Detect when called from within callbacks #2302

zagor · 2018-02-10T14:21:32Z

Here's a proof-of-concept for a simple mechanism to protect against recursive api calls from within callbacks.

Before I apply this method to more callbacks and api calls, am I missing something or is this a good enough solution?

bagder · 2018-02-10T18:54:42Z

lib/urldata.h

@@ -1750,6 +1750,7 @@ struct Curl_easy {
  iconv_t inbound_cd;          /* for translating from the network encoding */
  iconv_t utf8_cd;             /* for translating to UTF8 */
 #endif /* CURL_DOES_CONVERSIONS && HAVE_ICONV */
+  bool in_callback;            /* true while executing a callback */


I would prefer to see this added to the struct UrlState, which is a sub-struct within Curl_easy. Simply because we tend to store state "things" there.

bagder · 2018-02-10T18:56:03Z

The CI build failure is because you forgot to add CURLE_RECURSIVE_API_CALL to docs/libcurl/symbols-in-versions which makes test 1119 fail.

zagor · 2018-02-10T23:18:55Z

Here's a first version I believe to be complete. I'm torn about the function names. Curl_set_in_callback() describes what the code does but not why. Would something like Curl_lock_api() be a better name?

One might also discuss the fact that API functions with a Curl_easy pointer use the Curl_is_in_callback() access function, while functions with a Curl_multi pointer look at the in_callback bool directly. I guessed you don't want functions for single statement code like that, but correct me if I'm wrong.

Oh, one more thing: I deliberately left out the recursivity check from curl_easy_getinfo() since I have a hunch it might be used by more than a few people and as far as I can tell should be farily benign to call from callbacks.

bagder · 2018-02-11T09:01:03Z

Errors with picky compiler options :

multi.c: In function ‘Curl_is_in_callback’:
multi.c:3107:23: error: suggest parentheses around ‘&&’ within ‘||’ [-Werror=parentheses]
   return (easy->multi && easy->multi->in_callback ||
                       ^

bagder · 2018-02-11T12:45:22Z

Lovely, I like this. Two more things I'd like to see:

add the new error codes to the libcurl-errors.3 man page (I don't think we have any test - yet - that detects this)
a test case that verifies that this works, at least for some combination of callback/function.

bagder · 2018-02-15T08:38:38Z

Thanks!

Alexander-- · 2018-04-06T18:47:48Z

@bagder @zagor Could you please explain the purpose of this feature? What problem does it solve?

jay · 2018-04-06T18:59:57Z

It mitigates undefined behavior due to user error. (We can never completely "solve" that problem.) Some functions are not meant to be called within callbacks.

Alexander-- · 2018-04-06T19:19:44Z

@jay Could you please elaborate? What specific errors would you like to prevent by artificially making most of curl's functions non-reentrant?

I assume, that the author of this PR wanted to "save" beginning developers from recursively calling curl_XXX_perform. But in process he has aggressively banned most curl functions from being invoked by callbacks for reasons, not entirely clear to me.

In addition to banning actually dangerous functions, this PR makes it impossible to invoke functions, that do not themselves invoke callbacks, and even harmless read-only functions like curl_timeout and curl_fdset. This sounds like solution in search of problem.

bagder · 2018-04-06T21:39:10Z

We've had users (yes plural) in the past call back into libcurl from callbacks with disastrous results since the libcurl internals are not built to allow that, leading to the destruction of internal data. This precaution is done to help users avoid shooting themselves in the foot.

If there are functions you think we should allow to get called from within callbacks, then we should discuss explicitly allowing them, documenting that fact and preferably add tests for that so that we can make sure that it keeps working even in the future and not just by chance.

Alexander-- · 2018-04-06T23:15:57Z

@bagder Please consider, what actual problems may arise from such calls. I can think of two types of such problems:

The function recursively invokes the same callback, leading to endless recursion. For example, a header callback may attempt to invoke curl_perform, which in turn may try to invoke the callback again.
The function prematurely releases resources, because it assumes, that caller no longer needs them. For example, a header callback may attempt to invoke curl_perform, which notices, that header callback was already invoked and deallocates header buffers, leading to segfault upon return.

Neither of those problems affect "pure" functions, such as curl_multi_fdset and curl_multi_timeout. Since those functions do not modify any shared data, it is impossible for inconsistencies to arise from calling them.

These issues also do not affect functions like curl_multi_assign, which operate on shared data that is never exposed to user. The function amounts to single atomic assignment (from the viewpoint of caller), so it is just as recursion-safe as one.

I agree, that having tests for those things would be great, however this issue can not be resolved by tests alone. The reason why people are trying to call curl_perform recursively does not lie in flawed coding practices, — they just misunderstood curl's contract.

When I started working with curl's API I intuitively assumed, that pausing curl_easy actually, well… pauses it. When I discovered, that this isn't the case, that made me rather dissatisfied. The people, who have been recursively calling perform() and remove() from callbacks, naively assumed the same thing, — that curl's internal state can be though of as a simple read/write loop. This is a direct result of poor documentation (no, calling a complex, quirky, hard to customize wrapper "easy API" does not count as good documentation, even if your documentation explains everything in detail on page 956).

Likewise, when people see an ordinary setter, that really should be implemented as simple pointer assignment (e.g. a curl_easy_setinfo() with READ_CALLBACK/WRITE_CALLBACK), they generally expect it to work as such, — including immediately using a new callback on next invocation. Even documenting the opposite won't make them happy, if function contract is counter-intuitive.

TL; DR: overly defensive programming will not solve the underlying problem, — a poorly conveyed library contract. Until every method has a documentation, telling what exactly follows from calling it, people will still screw up, no matter how much hostile error checking you add. Speaking of which, POSIX and developers of C libraries explicitly document properties of each method (poority, singnal- and thread-safety etc.) even if they make zero efforts towards actually making methods thread- and async-signal-safe.

bagder · 2018-04-07T09:37:48Z

There are several different types of errors when unexpected getting called back recursively. I recall segfaults that come from the second call doing things so that when the callback eventually returns, the internal state "uncertain" leading to crashes or other undefined behaviors that are hard (for users) to understand.

Neither of those problems affect "pure" functions

Again: I'm perfectly okay with discussing and making specific functions allowed to be used from within the callbacks. We already allow some that we thought of immediately. If there are more, we should fix those. And document this fact.

this issue can not be resolved by tests alone

Tests basically never "resolve" things, but they're excellent in making sure that we don't accidentally break things again down the line when we change the code for other reasons.

When I started working with curl's API I intuitively assumed, that pausing curl_easy actually, well… pauses it.

It does pause it. I don't understand in which way you could pause a transfer in any other way than what the current pausing does.

This is a direct result of poor documentation (no, calling a complex, quirky, hard to customize wrapper "easy API" does not count as good documentation, even if your documentation explains everything in detail on page 956).

If you're going to get into insults and just be cranky then I'll just stop this discussion because that's nothing more than rude and shows your complete lack of understanding of what we're trying to do here and the efforts it takes.

We have transfer library that is ultimately used by billions of people daily, we get a huge amount of bug reports and improvements over time and we try to keep the API and ABI really stable. We work our asses off to make this happen and to document all the details of the API.

Do we slip sometimes and do mistakes? Absolutely - and I'm sure most of them are done by me personally. But the way to handle those mistakes is not to whine and write sarcastic comments in the bug tracker. The way forward is to suggest what we should do next to improve the situation we have now.

So, do you want to complain or do you want to fix things?

bagder reviewed Feb 10, 2018

View reviewed changes

TODO 1.7: Detect when called from within callbacks

8ad26d4

zagor force-pushed the protect-callbacks branch from 0d7172a to 8ad26d4 Compare February 10, 2018 23:05

zagor added 2 commits February 11, 2018 10:53

Code style adjustments

754e9b7

Add CURLM_RECURSIVE_API_CALL and use that for multi APIs

9c59574

zagor added 7 commits February 11, 2018 20:38

Fix checksrc remarks

12452ba

Update test 1538 expected output

778572e

Remove recursion lock from curl_easy_pause()

7f10a05

Remove recursion lock from curl_easy_setopt()

31b0142

Add test of api recursion protection

ea7379a

Document recursion lock error codes.

6ee07a3

Move declaration of 'res'

6488715

bagder closed this in b46cfbc Feb 15, 2018

lock bot locked as resolved and limited conversation to collaborators Jul 6, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TODO 1.7: Detect when called from within callbacks #2302

TODO 1.7: Detect when called from within callbacks #2302

zagor commented Feb 10, 2018

bagder Feb 10, 2018

zagor Feb 10, 2018

bagder commented Feb 10, 2018

zagor commented Feb 10, 2018 •

edited

bagder commented Feb 11, 2018

bagder commented Feb 11, 2018

bagder commented Feb 15, 2018

Alexander-- commented Apr 6, 2018

jay commented Apr 6, 2018

Alexander-- commented Apr 6, 2018 •

edited

bagder commented Apr 6, 2018

Alexander-- commented Apr 6, 2018

bagder commented Apr 7, 2018

TODO 1.7: Detect when called from within callbacks #2302

TODO 1.7: Detect when called from within callbacks #2302

Conversation

zagor commented Feb 10, 2018

bagder Feb 10, 2018

Choose a reason for hiding this comment

zagor Feb 10, 2018

Choose a reason for hiding this comment

bagder commented Feb 10, 2018

zagor commented Feb 10, 2018 • edited

bagder commented Feb 11, 2018

bagder commented Feb 11, 2018

bagder commented Feb 15, 2018

Alexander-- commented Apr 6, 2018

jay commented Apr 6, 2018

Alexander-- commented Apr 6, 2018 • edited

bagder commented Apr 6, 2018

Alexander-- commented Apr 6, 2018

bagder commented Apr 7, 2018

zagor commented Feb 10, 2018 •

edited

Alexander-- commented Apr 6, 2018 •

edited