multi/socket: assertion failure when requesting h2 in parallel #4012

Closed
TvdW opened this issue Jun 11, 2019 · 9 comments

@TvdW

TvdW commented Jun 11, 2019

I decided to give curl master a try to see whether any major issues were left after my recent issue reports, and found a new crash (an assertion failure, so SIGABRT rather than a segfault):

https://gist.github.com/TvdW/1a64e73c77ce5b695515c4493675ccbe

* STATE: DO_DONE => PERFORM handle 0x789318; line 1757 (connection #0)
a.out: multi.c:2535: multi_socket: Assertion `data->magic == 0xc0dedbadU' failed.
Program received signal SIGABRT, Aborted.

(gdb) bt full
[...]
#3  0x00007ffff715c312 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x0000000000410511 in multi_socket (multi=0x70fcd8, checkall=false, s=7, ev_bitmask=1, running_handles=0x7fffffffdda4) at multi.c:2535
        iter = {hash = 0x78fb38, slot_index = 9, current_element = 0x7907d8}
        he = 0x7907d8
        pipe_st = {old_pipe_act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0, 4, 140737488345700, 4, 7931824, 140737488345680, 4539110, 7927584, 4, 140737488345700, 7404944, 140737488345728, 
                7931784, 7405656, 7931784, 7901976}}, sa_flags = 67108864, sa_restorer = 0x7ffff7163270 <__restore_rt>}, no_signal = false}
        entry = 0x78fb38
        result = CURLM_OK
        data = 0x72fc18
        t = 0x410f92 <Curl_update_timer+52>
        now = {tv_sec = 15208913, tv_usec = 863696}
        __PRETTY_FUNCTION__ = "multi_socket"
#5  0x0000000000410d60 in curl_multi_socket_action (multi=0x70fcd8, s=7, ev_bitmask=1, running_handles=0x7fffffffdda4) at multi.c:2705
        result = CURLM_OK
#6  0x0000000000406d3a in main (argc=1, argv=0x7fffffffded8) at test2.c:92
        crfds = {__fds_bits = {128, 0 <repeats 15 times>}}
        cwfds = {__fds_bits = {0 <repeats 16 times>}}
        count = 1
        i = 7
        running = 4
        multi = 0x70fcd8
        remaining = 192
curl 7.65.2-DEV (x86_64-unknown-linux-gnu) libcurl/7.65.2-DEV OpenSSL/1.0.2k-fips zlib/1.2.7 c-ares/1.15.0 nghttp2/1.38.0
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS HTTP2 HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL UnixSockets
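
For readers unfamiliar with the check that fires in the log above: comparing a magic field against a fixed constant is a common way to catch a pointer to an object that has already been closed. The following is a minimal sketch of that pattern only. The struct and the function names are hypothetical and are not libcurl's real definitions; only the constant is taken from the assertion message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Constant copied from the assertion message in the report. */
#define HANDLE_MAGIC 0xc0dedbadU

/* Hypothetical stand-in for a transfer handle, not libcurl's struct. */
struct handle {
  unsigned int magic; /* HANDLE_MAGIC while the handle is usable */
  int id;
};

/* Mark the handle as valid when it is set up. */
static void handle_init(struct handle *h, int id)
{
  h->magic = HANDLE_MAGIC;
  h->id = id;
}

/* Clear the magic on close so that stale pointers can be detected. */
static void handle_close(struct handle *h)
{
  h->magic = 0;
}

/* True only for a non-NULL handle that still carries the magic value. */
static bool handle_is_live(const struct handle *h)
{
  return h != NULL && h->magic == HANDLE_MAGIC;
}

/* Demonstration: a closed handle no longer passes the check. */
static bool demo_stale_handle_detected(void)
{
  struct handle h;
  handle_init(&h, 1);
  assert(handle_is_live(&h));
  handle_close(&h);
  return !handle_is_live(&h);
}
```

In this sketch a debug build would typically wrap `handle_is_live()` in an `assert`, which matches the symptom in the report: the process aborts as soon as a stale pointer is dereferenced instead of silently corrupting memory.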
@TvdW
Author

TvdW commented Jun 11, 2019

Potentially related to #3991 (same assertion gets failed) but submitting as separate issue because the way to reproduce is very different.

@TvdW
Author

TvdW commented Jun 11, 2019

Hm, I'm now also able to reproduce the bug by doing just a single request in parallel... very skeptical about this bug 😕

@bagder
Member

bagder commented Jun 11, 2019

Thanks, with your example code I too hit the assert. I'm on it.

@TvdW
Author

TvdW commented Jun 11, 2019

Whew, I thought I was going crazy (spent an hour double-checking, seemed too simple). Thanks!

bagder added a commit that referenced this issue Jun 11, 2019
- The transfer hashes weren't using the correct keys so removing entries
  failed.

- Simplified the iteration logic over transfers sharing the same socket and
  they now simply are set to expire and thus get handled in the "regular"
  timer loop instead.

Fixes #4012 (ideally)
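
The first point of the commit message describes a general failure mode: if an entry is added under one key and removal is attempted with a different key, the removal fails and a stale entry stays behind. The sketch below illustrates that failure mode only. It is not curl's code: the table (a linear scan standing in for a hash), the `transfer` struct, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOTS 8

/* Hypothetical transfer object. */
struct transfer { int id; };

/* One slot: the key is the raw bytes of a pointer value. */
struct entry {
  bool used;
  unsigned char key[sizeof(void *)];
  struct transfer *value;
};

struct table { struct entry slots[SLOTS]; };

/* Store value under the sizeof(void *) bytes found at key. */
static bool table_add(struct table *t, const void *key, struct transfer *v)
{
  for(size_t i = 0; i < SLOTS; i++) {
    if(!t->slots[i].used) {
      t->slots[i].used = true;
      memcpy(t->slots[i].key, key, sizeof(void *));
      t->slots[i].value = v;
      return true;
    }
  }
  return false; /* table full */
}

/* Remove the entry whose key bytes match; false if nothing matched. */
static bool table_remove(struct table *t, const void *key)
{
  for(size_t i = 0; i < SLOTS; i++) {
    if(t->slots[i].used &&
       memcmp(t->slots[i].key, key, sizeof(void *)) == 0) {
      t->slots[i].used = false;
      return true;
    }
  }
  return false;
}

static size_t table_count(const struct table *t)
{
  size_t n = 0;
  for(size_t i = 0; i < SLOTS; i++)
    n += t->slots[i].used ? 1 : 0;
  return n;
}

/* Add one transfer, then remove it with a matching or mismatching key.
   Returns the number of entries left in the table. */
static size_t remaining_after_remove(bool use_correct_key)
{
  struct table t;
  struct transfer a = { 1 };
  struct transfer *pa = &a;
  struct table *other = &t; /* a different pointer value: the wrong key */

  memset(&t, 0, sizeof(t));
  table_add(&t, &pa, pa);
  if(use_correct_key)
    table_remove(&t, &pa);
  else
    table_remove(&t, &other); /* fails, the entry stays behind */
  return table_count(&t);
}
```

With the matching key the table ends up empty; with the mismatching key one entry remains and still points at the transfer. If that transfer is later closed, iterating the table yields a dangling pointer, which is the kind of situation a magic-number assertion is meant to catch.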
@bagder
Member

bagder commented Jun 11, 2019

Oh man what a brain malfunction. Stand by for PR.

@jay
Member

jay commented Jun 12, 2019

@TvdW's test works with the fix in #4014. (To fully satisfy ASan I had to add curl_multi_cleanup(multi); curl_global_cleanup(); to the test.)

@jay
Member

jay commented Jun 12, 2019

This probably doesn't solve #3991: I just got a busy loop after running the test repeatedly.

* Connection still in use 1, no more multi_done now!

@bagder
Member

bagder commented Jun 12, 2019

I could spot the still in use problem too occasionally, but I'm pretty sure it is a separate issue.

@bagder
Member

bagder commented Jun 12, 2019

I think I'll proceed and land #4014 first and then we can continue and see if we can reproduce further problems and take them on, one by one. As usual.

@bagder bagder closed this as completed in 8b987cc Jun 12, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Sep 10, 2019