Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

conn vs data cleanup #3442

Closed
wants to merge 1 commit into from
Closed

conn vs data cleanup #3442

wants to merge 1 commit into from

Conversation

bagder
Copy link
Member

@bagder bagder commented Jan 5, 2019

Easy handles and connections

History

A common pattern in the libcurl code is having connections (struct
connectdata) point to its corresponding transfer (easy handle) using the
conn->data struct member.

We therefore pass around the 'conn' pointer only in a lot of places as we can
defer the transfer from it (using conn->data).

This made perfect sense back in the days when each transfer had a single
connection for the duration of a transfer and became less good when we added
Pipelining and downright ugly when we added support for multiplexed
connections.

Now, several simultaneous transfers can use the same connection so conn->data
has to be updated very frequently to make it accurate for every possible
situation. This is error-prone and is repeteadly a reason for issues and bugs.

Two transfers using two connections

    /------------\          /--------------\
    | Transfer 1 |   ===>   | Connection 1 |
    | 'data'     |          \--------------/
    \------------/

    /------------\
    | Transfer 2 |          /--------------\
    | 'data'     |   ===>   | Connection 2 |
    \------------/          \--------------/

Two transfers using one connection

    /------------\          /--------------\
    | Transfer 1 |   ===>   | Connection 1 |
    | 'data'     |     -->  \--------------/
    \------------/    /
                     /
    /------------\  /
    | Transfer 2 | /
    | 'data'     |
    \------------/

A better way

A transfer however only uses a single connection for a transfer (or none at
all) at any given time. data->conn gets assigned when a connection is setup
and it gets cleared when the connection is again closed or returned to the
connection pool. Several transfers can point to the same connection while they
multiplex over the same one.

Going forward

Use of 'conn->data' should be removed over time in preference to 'data->conn'.
Function prototypes should rather work with struct Curl_easy *data as the
identifier poiting out the specific transfer and 'data->conn' to get the
connection (which then is NULL if there is no associated connection for this
transfer), rather than the old style that passes in struct connectdata *conn
to most functions.

  • Curl_attach_connnection() - this is the new function that gets called when
    a connection is "attached" to a transfer. When the two gets their
    association. The association is then kept until detached at a later point.

  • Curl_detach_connnection() - this is the new function that gets called to
    separate the tranfer from its associated connection. From this point on,
    the transfer has no longer any associated connection.

Gradual, not single-shot

I previously made an attempt in making this change (basically removing
conn->data) in a single shot, a single pull-request, but even before it was
completed the changeset reached over 3000 modified lines. Quite simply too
large to be reviewable and landable in one atomic change.

That experience made me draw the conclusion that we need to make this change
gradually and over time.

Over time, data->conn should be used and conn->data use should be reduced.

@jay
Copy link
Member

jay commented Jan 7, 2019

I think it will probably be a bit rocky regardless. As it is it looks fine to me however CI re Curl_connect create_conn does not always allocate for conn.

@bagder
Copy link
Member Author

bagder commented Jan 7, 2019

I think it will probably be a bit rocky regardless.

Probably, yes. But I still think doing it gradually instead of a huge single merge will make a softer landing.

As it is it looks fine to me however CI re Curl_connect create_conn does not always allocate for conn.

Clearly some more fixes are needed to make the builds green...

@bagder bagder force-pushed the bagder/conn-data-cleanup branch 2 times, most recently from a60166a to 4dcc400 Compare January 9, 2019 14:16
bagder added a commit that referenced this pull request Jan 9, 2019
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Closes #3442
bagder added a commit that referenced this pull request Jan 10, 2019
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Side-effect: this also removes some old pipelining logic that isn't used and
is destined for removal in April 2019 according to DEPRECATE.md.

Closes #3442
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Side-effect: this also removes some old pipelining logic that isn't used and
is destined for removal in April 2019 according to DEPRECATE.md.

Closes #3442
@bagder
Copy link
Member Author

bagder commented Jan 10, 2019

Fixed the merge conflicts #3448 caused.

@bagder bagder closed this in ba24323 Jan 11, 2019
@bagder bagder deleted the bagder/conn-data-cleanup branch January 13, 2019 22:11
jay added a commit to jay/curl that referenced this pull request Feb 21, 2019
- Split off connection shutdown procedure from Curl_disconnect into new
  function conn_shutdown.

- Change the shutdown procedure to close the sockets before
  disassociating the transfer.

Prior to this change the sockets were closed after disassociating the
transfer so SOCKETFUNCTION wasn't called since they were closed. That
likely came about from recent work started in Jan 2019 (curl#3442) to
separate transfers from connections.

Bug: https://curl.haxx.se/mail/lib-2019-02/0101.html
Reported-by: Pavel Löbl

Closes curl#3597
Closes #xxxx
jay added a commit to jay/curl that referenced this pull request Feb 21, 2019
- Split off connection shutdown procedure from Curl_disconnect into new
  function conn_shutdown.

- Change the shutdown procedure to close the sockets before
  disassociating the transfer.

Prior to this change the sockets were closed after disassociating the
transfer so SOCKETFUNCTION wasn't called since they were closed. That
likely came about from recent work started in Jan 2019 (curl#3442) to
separate transfers from connections.

Bug: https://curl.haxx.se/mail/lib-2019-02/0101.html
Reported-by: Pavel Löbl

Closes curl#3597
Closes #xxxx
jay added a commit that referenced this pull request Feb 22, 2019
- Split off connection shutdown procedure from Curl_disconnect into new
  function conn_shutdown.

- Change the shutdown procedure to close the sockets before
  disassociating the transfer.

Prior to this change the sockets were closed after disassociating the
transfer so SOCKETFUNCTION wasn't called since the transfer was already
disassociated. That likely came about from recent work started in
Jan 2019 (#3442) to separate transfers from connections.

Bug: https://curl.haxx.se/mail/lib-2019-02/0101.html
Reported-by: Pavel Löbl

Closes #3597
Closes #3598
@lock lock bot locked as resolved and limited conversation to collaborators Apr 13, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants