curl / Mailing Lists / curl-users / Single Mail

curl-users

Re: 10 second stalls when retrieving pages from washingtonpost.com

From: Ray Satiro <raysatiro_at_yahoo.com>
Date: Sat, 17 Nov 2018 16:56:05 -0500

On 11/17/2018 6:04 AM, Daniel Stenberg wrote:
> On Fri, 9 Nov 2018, John Brayton wrote:
>
>> I am trying to retrieve pages from washingtonpost.com using curl and
>> am seeing 8-10 second stalls. For example, this command takes between
>> 8 and 10 seconds:
>>
>> Any suggestions would be very helpful to me. Thank you.
>
> There's no doubt that this is the server's choice or flaw.
>
> If you add -v and --trace-time to the command line you can see clearly
> that the server is taking a long time to respond in these cases.
>
> I also tried with different curl versions and TLS backends and I
> couldn't avoid the stall.
>
> Curious! Time to fire up wireshark and see if that can teach us
> something here...

It's AkamaiGhost which I'm sure is caching the pages based on certain
header fields (usually user agent). Unfortunately it doesn't respond to
debug requests so I can't say for sure. Nothing can be done about this
except send some headers that already match a cached copy. I suggest the
ESR version of Firefox agent, since I think that would be most likely to
match over the longest period of time.

--compressed --user-agent "Mozilla/5.0 (Windows NT 6.1; rv:60.0)
Gecko/20100101 Firefox/60.0"

If you do that you may receive compressed content though which is why
--compressed so it's automatically decompressed. Also here is an old ESR
version that seems to work just as well: "Mozilla/5.0 (Windows NT 6.1;
rv:38.0) Gecko/20100101 Firefox/38.0"

-----------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
Etiquette: https://curl.haxx.se/mail/etiquette.html
Received on 2018-11-17