Caching and Time Perception in Web Page Loads

When a visitor hits a site, the site has 250ms to load the page.  

Under 250ms, the page is golden.  The site feels snappy.  The browser runs smoothly.  The site is healthy.

Between 250ms and 1s, the site feels loaded down, though not overly so.  Users will not browse away in under a second.  It's not fantastic, but it is workable.

Between 1s and 4s, the site suffers user attrition.  Some users will stay, but a proportion will browse away, close the tab, complain the site is too slow, or otherwise leave.  That proportion grows as the load time approaches 4s.

After 4s, the user attrition rate climbs to 100%.  No one waits 4s for a page to load.

Get out a digital timer and try it.  Open random pages, time how long each takes to load, and note how satisfying the load feels.  Under 250ms, the page loads "immediately."  Over 4s, the page is "sluggish, slow and dated."
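
For a repeatable version of the experiment, a script can do the timing.  Here is a minimal Python sketch; the URLs are placeholders, and it times only the raw HTML fetch -- a real browser also pulls assets and renders, so treat the numbers as a floor:

    import time
    import urllib.request

    # Hypothetical URLs; substitute pages you actually want to test.
    PAGES = [
        "https://example.com/",
        "https://example.org/",
    ]

    for url in PAGES:
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()  # pull the whole body, not just the headers
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms < 250:
            verdict = "immediate"
        elif elapsed_ms < 1000:
            verdict = "workable"
        elif elapsed_ms < 4000:
            verdict = "losing users"
        else:
            verdict = "sluggish, slow and dated"
        print(f"{url}: {elapsed_ms:.0f}ms ({verdict})")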

Everything in the path of loading a page adds latency:

  • Loading a local asset from disk;
  • Executing complex logic;
  • Making a REST call;
  • Running a SELECT (any SELECT) on a database.

Anything that touches the disk causes latency.  Anything that touches the network causes latency.  Anything that causes code to execute causes latency.  In fact, anything that even touches the site causes latency.

The trick is this:

  • To load as much of the site into memory as possible;
  • To do as little work as possible.
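
The cheapest version of both points at once is to compute a thing one time and keep the answer in process memory.  A minimal sketch using Python's standard library; render_nav_menu is a hypothetical stand-in for any expensive, repeated computation:

    from functools import lru_cache

    @lru_cache(maxsize=1024)
    def render_nav_menu(user_role: str) -> str:
        # Pretend this does expensive template rendering or a slow lookup.
        # With lru_cache, the work runs once per role; every repeat call
        # is answered from process memory without redoing it.
        return f"<nav>menu for {user_role}</nav>"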

So we cache in memory because memory is fast and cheap and disk is slow.  What do we cache?

Everything.

To wit:

  • Sessions (although project butterfly does not yet have sessions);
  • Results of SELECT queries;
  • Entire REST calls to reduce round-trip times (see the sketch after this list);
  • Easy lookup data;
  • Assets (images, JavaScript, and CSS);
  • Entire flat static web pages;
  • Anything that looks even remotely cacheable.
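
For the REST case, here is a sketch of caching an entire upstream response, assuming a Redis instance on localhost and a hypothetical rates endpoint; the five-minute TTL is an arbitrary freshness choice:

    import json
    import urllib.request

    import redis  # assumes a Redis instance on localhost:6379

    cache = redis.Redis()

    def fetch_rates(currency: str) -> dict:
        # Serve the whole REST response from cache when we have it.
        key = f"rest:rates:{currency}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # hit: zero network round trips
        # Miss: make the real call, then keep the body for five minutes.
        url = f"https://api.example.com/rates/{currency}"  # hypothetical endpoint
        with urllib.request.urlopen(url) as response:
            body = response.read()
        cache.setex(key, 300, body)
        return json.loads(body)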

Taking ready-made data out of cache is faster than computing it, looking it up, going to disk for it, or asking another service for it.  In most production systems, roughly 80% of the work is reading data and only 20% is generating it.  Of those reads, ~99% are the same reads repeated ad nauseam.  So we take the pre-generated answer, throw it into a key-value cache, and pull it out on request.  Why generate the data when the answer is already there?
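
That read pattern is usually called cache-aside.  A minimal sketch against a hypothetical products table in SQLite, again assuming a local Redis instance; note that a real system also has to invalidate or expire the key when the row changes, which is the hard part:

    import json
    import sqlite3

    import redis  # assumes a Redis instance on localhost:6379

    cache = redis.Redis()
    db = sqlite3.connect("site.db")  # hypothetical database file

    def get_product(product_id: int):
        key = f"product:{product_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # the ~99% case: a repeated read
        # Miss: run the SELECT once, then keep the pre-generated answer.
        row = db.execute(
            "SELECT id, name, price FROM products WHERE id = ?",
            (product_id,),
        ).fetchone()
        if row is None:
            return None
        product = {"id": row[0], "name": row[1], "price": row[2]}
        cache.set(key, json.dumps(product))
        return product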

Big sites mean huge caches.  We can build caching into the system at:

  • The web page/asset level (see the header sketch below);
  • The web front-end logic level;
  • The backend application logic level, in front of the database;
  • The asynchronous backend worker level.

Some of these can be shared.  
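
At the web page/asset level, the cache does not even have to live in our memory: HTTP headers let browsers and CDNs hold copies for us.  A minimal WSGI sketch serving a stylesheet; the asset body is a placeholder:

    def app(environ, start_response):
        # Placeholder asset; a real app would read the file once at startup.
        body = b"/* site.css contents */"
        start_response("200 OK", [
            ("Content-Type", "text/css"),
            ("Content-Length", str(len(body))),
            # Let browsers and CDNs keep the asset for a day; repeat
            # visits never reach the server at all.
            ("Cache-Control", "public, max-age=86400"),
        ])
        return [body]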

Anyway, this is all basic stuff.  There are two standard caching systems -- memcached and Redis -- worth discussing.  Both are widely supported, with client libraries available in virtually every language.  There are some secondary systems, like Ehcache and Riak, which are a bit more specialized.
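
Both expose the same core idea -- set a key, get a key.  A side-by-side sketch, assuming local instances on the default ports; pymemcache is one of several memcached client libraries:

    from pymemcache.client.base import Client as MemcacheClient
    import redis

    mc = MemcacheClient(("localhost", 11211))     # memcached default port
    r = redis.Redis(host="localhost", port=6379)  # Redis default port

    # memcached: the value expires after 60 seconds.
    mc.set("greeting", b"hello", expire=60)
    print(mc.get("greeting"))  # b'hello'

    # Redis: the same idea, with a different spelling for the TTL.
    r.set("greeting", "hello", ex=60)
    print(r.get("greeting"))  # b'hello'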

Datasets beyond what memcached or Redis can handle push the conversation quickly toward memory-only/NoSQL databases.  But next, we need to pick a standard for the site based on what will work best for it.