New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TODO 1.7: Detect when called from within callbacks #2302
Conversation
lib/urldata.h
Outdated
@@ -1750,6 +1750,7 @@ struct Curl_easy { | |||
iconv_t inbound_cd; /* for translating from the network encoding */ | |||
iconv_t utf8_cd; /* for translating to UTF8 */ | |||
#endif /* CURL_DOES_CONVERSIONS && HAVE_ICONV */ | |||
bool in_callback; /* true while executing a callback */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to see this added to the struct UrlState
, which is a sub-struct within Curl_easy. Simply because we tend to store state "things" there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
The CI build failure is because you forgot to add |
0d7172a
to
8ad26d4
Compare
Here's a first version I believe to be complete. I'm torn about the function names. One might also discuss the fact that API functions with a Oh, one more thing: I deliberately left out the recursivity check from |
Errors with picky compiler options :
|
Lovely, I like this. Two more things I'd like to see:
|
Thanks! |
It mitigates undefined behavior due to user error. (We can never completely "solve" that problem.) Some functions are not meant to be called within callbacks. |
@jay Could you please elaborate? What specific errors would you like to prevent by artificially making most of curl's functions non-reentrant? I assume, that the author of this PR wanted to "save" beginning developers from recursively calling In addition to banning actually dangerous functions, this PR makes it impossible to invoke functions, that do not themselves invoke callbacks, and even harmless read-only functions like curl_timeout and curl_fdset. This sounds like solution in search of problem. |
We've had users (yes plural) in the past call back into libcurl from callbacks with disastrous results since the libcurl internals are not built to allow that, leading to the destruction of internal data. This precaution is done to help users avoid shooting themselves in the foot. If there are functions you think we should allow to get called from within callbacks, then we should discuss explicitly allowing them, documenting that fact and preferably add tests for that so that we can make sure that it keeps working even in the future and not just by chance. |
@bagder Please consider, what actual problems may arise from such calls. I can think of two types of such problems:
Neither of those problems affect "pure" functions, such as These issues also do not affect functions like I agree, that having tests for those things would be great, however this issue can not be resolved by tests alone. The reason why people are trying to call curl_perform recursively does not lie in flawed coding practices, — they just misunderstood curl's contract. When I started working with curl's API I intuitively assumed, that pausing curl_easy actually, well… pauses it. When I discovered, that this isn't the case, that made me rather dissatisfied. The people, who have been recursively calling perform() and remove() from callbacks, naively assumed the same thing, — that curl's internal state can be though of as a simple read/write loop. This is a direct result of poor documentation (no, calling a complex, quirky, hard to customize wrapper "easy API" does not count as good documentation, even if your documentation explains everything in detail on page 956). Likewise, when people see an ordinary setter, that really should be implemented as simple pointer assignment (e.g. a curl_easy_setinfo() with READ_CALLBACK/WRITE_CALLBACK), they generally expect it to work as such, — including immediately using a new callback on next invocation. Even documenting the opposite won't make them happy, if function contract is counter-intuitive. TL; DR: overly defensive programming will not solve the underlying problem, — a poorly conveyed library contract. Until every method has a documentation, telling what exactly follows from calling it, people will still screw up, no matter how much hostile error checking you add. Speaking of which, POSIX and developers of C libraries explicitly document properties of each method (poority, singnal- and thread-safety etc.) even if they make zero efforts towards actually making methods thread- and async-signal-safe. |
There are several different types of errors when unexpected getting called back recursively. I recall segfaults that come from the second call doing things so that when the callback eventually returns, the internal state "uncertain" leading to crashes or other undefined behaviors that are hard (for users) to understand.
Again: I'm perfectly okay with discussing and making specific functions allowed to be used from within the callbacks. We already allow some that we thought of immediately. If there are more, we should fix those. And document this fact.
Tests basically never "resolve" things, but they're excellent in making sure that we don't accidentally break things again down the line when we change the code for other reasons.
It does pause it. I don't understand in which way you could pause a transfer in any other way than what the current pausing does.
If you're going to get into insults and just be cranky then I'll just stop this discussion because that's nothing more than rude and shows your complete lack of understanding of what we're trying to do here and the efforts it takes. We have transfer library that is ultimately used by billions of people daily, we get a huge amount of bug reports and improvements over time and we try to keep the API and ABI really stable. We work our asses off to make this happen and to document all the details of the API. Do we slip sometimes and do mistakes? Absolutely - and I'm sure most of them are done by me personally. But the way to handle those mistakes is not to whine and write sarcastic comments in the bug tracker. The way forward is to suggest what we should do next to improve the situation we have now. So, do you want to complain or do you want to fix things? |
Here's a proof-of-concept for a simple mechanism to protect against recursive api calls from within callbacks.
Before I apply this method to more callbacks and api calls, am I missing something or is this a good enough solution?