August 15, 2009

It's All In the Cache

Twitter is excellent for short messages, but when it comes to debugging a tech problem - 140 characters becomes difficult if not impossible to use.

So we come to a tweet by David Pogue. 

I'm putting the rest below the fold as it will get long.



Reader asks: "How come, after waiting 30 sec for a Web page to load, I'll click Go again, and this time it loads INSTANTLY?" Anyone?


This is somewhat better than tweeting - "my computer won't turn on, does anyone know why", but not much. I'm not being flippant; what I'm saying is that, other than the loading issue, there is no real information here.

My first guess was a "browser cache" issue. (BTW - cache in this case is pronounced cash)

All browsers come configured to "cache" web pages. This was started in the days of modems, very low bandwidth availability, AND static web pages.  If you went to a web page once, the browser would keep a copy of that page in a special folder on your system.  Then the next time you wanted to look at the page, it did not have to go out onto the web and pull the entire page down again, it could "instantly" display the page from the cache. 

Blogs and shopping pages were the downfall of the browser cache.  Yet caching is still set up by default on most web browsers (I turn mine off - I've been burned before when the cache kicked in without checking for updated pages - which had me thinking for several days that some of my blog friends hadn't been blogging - sheesh!).

The tweet above has the ambiguous "when I hit Go again"... what does that mean?  Does it mean you hit "reload"?  That is the only way to override a cached page.  Does it mean you put your cursor in the url window and hit return?  That would once again pull the cached page up and it would load quickly.

But let's say the "Go" button is the reload button.  Then you very likely have a DNS problem.

Because it's inefficient to keep going out to the DNS servers to get IP addresses of places you visit frequently, most computers keep their very own DNS cache.  It holds a list of the host names you have looked up.  Most web sites do not change their IP addresses often.  So if you visit a page, the site's IP address (all those dotted numbers) is stored in a little table on your computer.  Think of it like this:

www.yahoo.com = 69.147.76.15

When you click a link, the first place your browser looks is in the various caches.

Does it have a cached page? 
Yes - use that.  
No, then check the DNS cache on the system.

Is there a cached DNS entry? 
Yes - use that.
No - go out to your DNS server and find the address and place it in the cache.
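
The lookup order above can be sketched in a few lines of Python. This is only an illustration, assuming two in-memory dictionaries stand in for the browser's page cache and the system's DNS cache. The names page_cache, dns_cache, query_dns_server and load are made up for the example (as is the 192.0.2.10 address, which comes from a range reserved for documentation); no real browser is written this way.

```python
# Hypothetical stand-ins for the browser's page cache and the OS DNS cache.
page_cache = {}                                  # url -> page content
dns_cache = {"www.yahoo.com": "69.147.76.15"}    # host name -> IP address


def query_dns_server(hostname):
    """Pretend to ask the configured DNS server (the slow step)."""
    # Assumption: a fixed table replaces a real network query.
    fake_dns_server = {"www.example.com": "192.0.2.10"}
    return fake_dns_server[hostname]


def load(url, hostname):
    """Return (where the answer came from, the answer)."""
    # 1. Is there a cached page? Yes - use that.
    if url in page_cache:
        return ("page cache", page_cache[url])
    # 2. Is there a cached DNS entry? Yes - use that.
    if hostname in dns_cache:
        return ("dns cache", dns_cache[hostname])
    # 3. No - ask the DNS server and place the answer in the cache.
    ip = query_dns_server(hostname)
    dns_cache[hostname] = ip
    return ("dns server", ip)


print(load("http://www.example.com/", "www.example.com"))  # ('dns server', '192.0.2.10')
print(load("http://www.example.com/", "www.example.com"))  # ('dns cache', '192.0.2.10')
```

The second call never leaves the machine, which is the whole point of the cache.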

Now, if you are using your ISP's DNS servers, which most people do automatically because that's how they are set up, you come to the next place where things can break down.

You always have 2 DNS servers listed, the primary then the backup.  If the primary DNS server is down or is having issues, the query is supposed to go to the second DNS server.  However, there is a period of wait and see time to give the DNS server time to answer.  This can be anywhere from 30 seconds to 2 minutes.  Call it a timeout period.  Once that time has passed, the query is routed to the backup server where (one hopes) you can get an answer.
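
Here is a rough simulation of that primary-then-backup behavior. Nothing in it touches the network: the two "servers" are plain dictionaries, the names ask, resolve and ServerDown are invented for the sketch, and the timeout is shrunk to 0.2 seconds so it runs quickly. Real resolvers differ in their timeouts and retry rules.

```python
import time


class ServerDown(Exception):
    """Raised when a simulated DNS server gives no answer in time."""


def ask(server, hostname, timeout):
    """Simulated query: a server that is down costs the full timeout, then fails."""
    if not server["up"]:
        time.sleep(timeout)              # the wait and see period
        raise ServerDown(server["name"])
    return server["records"][hostname]


def resolve(hostname, servers, timeout=0.2):
    """Try each server in order; return (ip, server name, seconds waited)."""
    start = time.monotonic()
    for server in servers:
        try:
            ip = ask(server, hostname, timeout)
            return ip, server["name"], time.monotonic() - start
        except ServerDown:
            continue                     # move on to the backup
    raise ServerDown("no DNS server answered")


records = {"www.yahoo.com": "69.147.76.15"}
primary = {"name": "primary", "up": False, "records": records}
backup = {"name": "backup", "up": True, "records": records}

ip, answered_by, waited = resolve("www.yahoo.com", [primary, backup])
print(ip, "answered by", answered_by, "after", round(waited, 2), "seconds")
```

With the primary marked as down, the whole timeout is spent before the backup is even asked. Scale that up to a real timeout and you get a page that just sits there.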

When the computer receives that answer - it is placed in the cache and now every time you hit that page again it will pop up immediately.  The cached entry is generally good for about 24 hours, maybe 48, depending on the cache settings and on the time to live (TTL) that the site's DNS record carries.
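
A cache entry that is "good for about 24 hours" can be pictured like this. It is a minimal sketch, not how any particular operating system stores its DNS cache; the DnsCache class and its method names are made up, and the clock is passed in so the expiry can be shown without actually waiting a day.

```python
import time


class DnsCache:
    """Tiny host name -> IP cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so the example can fake the time
        self.entries = {}         # host name -> (ip, expires_at)

    def put(self, hostname, ip):
        self.entries[hostname] = (ip, self.clock() + self.ttl)

    def get(self, hostname):
        """Return the cached IP, or None if it is missing or has expired."""
        entry = self.entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[hostname]   # stale: force a fresh lookup
            return None
        return ip

    def flush(self):
        """Throw everything away, like flushing the DNS cache by hand."""
        self.entries.clear()


now = [0.0]                                    # fake clock, in seconds
cache = DnsCache(ttl_seconds=24 * 60 * 60, clock=lambda: now[0])
cache.put("www.yahoo.com", "69.147.76.15")
print(cache.get("www.yahoo.com"))              # 69.147.76.15
now[0] += 25 * 60 * 60                         # 25 hours later
print(cache.get("www.yahoo.com"))              # None
```

Until the entry expires or is flushed, the cache keeps handing out whatever address it stored, even if the site has since moved.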

Now, if you don't get the page at all, it could be something like - they had a denial of service attack and had to change their IP address to escape it. This takes some time to propagate through the system of DNS servers and if their old IP address is cached on your system, you now have a double whammy keeping you from picking up the web site.

Anyway... to check if DNS is the culprit from your home computer, you can flush your DNS cache - instructions for each type of operating system are located on this excellent web page.   Then if you try to hit the link again and the load is sllloooowwww, the culprit is almost certainly DNS.
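
If you would rather measure than guess, a small script can time two name lookups back to back. This is a sketch, assuming Python's standard socket.getaddrinfo is an acceptable stand-in for what the browser does; time_lookup and compare_lookups are names invented here. Python keeps no DNS cache of its own, so any speedup on the second lookup comes from a cache somewhere else (the operating system, the router, or the DNS server), and on systems with no local cache you may see little difference.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return how many seconds a single name lookup takes."""
    start = time.perf_counter()
    resolver(hostname, 80)            # port 80 is just a placeholder here
    return time.perf_counter() - start


def compare_lookups(hostname, resolver=socket.getaddrinfo):
    """Time the same lookup twice and return (first, second) in seconds."""
    first = time_lookup(hostname, resolver)
    second = time_lookup(hostname, resolver)
    return first, second
```

Flush the DNS cache, then run compare_lookups("www.yahoo.com") and print the two numbers. A first lookup that takes many seconds while the second is nearly instant points at DNS rather than at the web site itself.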

Some businesses have their own DNS cache set up to save on outbound bandwidth.  Therefore, if that DNS cache is incorrect - it too can cause headaches.  (But that's another story.)

Of course, if I can get back to the original point, oftentimes ISPs can have DNS problems.  Failures, changes, all kinds of things you have no control over.

Also, many ISP DNS servers are insecure and subject to a problem called cache poisoning, a hack I talked about a while back when I switched over to OpenDNS on my systems.

After ALL THAT we could still be looking at yet another problem because information is at a minimum.  And I didn't even go into all the other DNS stuff. 

If a blog post is not capable of containing sufficient information then Twitter isn't going to cut it.  The real problem needs to be thrashed out where more and better words can be exchanged.

Is that clear as mud? Good.

Posted by: Teresa in WebTech at 06:01 PM

1 140 characters isn't a lot to describe the question. But here's the longer version: Frequently--regardless of browser or OS--I'll request a URL. The browser connects, the screen goes blank, and then just sits there. 10 seconds. 20 seconds. Still no text or graphics. Impatient, I click Go or hit Enter a second time. Suddenly, the page appears. The cache isn't the issue, I believe, since the page has never been visited before. Hope this helps! --DP

Posted by: david Pogue at August 16, 2009 11:15 AM (bFbqk)

2 p.s.—your Comments box nuked all of my paragraph breaks!

Posted by: david Pogue at August 16, 2009 11:16 AM (bFbqk)

3 There could also be an intermediary cache server that is retrieving the page, but not passing it along the first time (network connectivity, bad routing, etc.) but in a round-robin type situation, the next request hits the cache server immediately?

Posted by: Craig at August 16, 2009 01:06 PM (Rc4D4)

4 Very true Craig. 

It sounds very similar to an irritating problem I have had in the past with routing issues on my Windows box.  If I had the vpn up (which was most of the time) a browser request would be shunted to the vpn IP first then time out before ending up on the correct route. 

I have that fixed now, but it's been a couple years and I forget what route I added to fix it.

The problem did show up immediately on a tracert so I'm wondering if this one could be captured on tracert too. 

And as you say, it could be an intermediary cache that is causing the problem.  This is actually a very sticky problem that might take some digging to fix. 

Posted by: Teresa at August 16, 2009 01:18 PM (epSz+)

5 If only people would call Take Supporb.

Posted by: dogette at August 22, 2009 10:01 AM (1W6eH)
