January 16, 2009

Introduction to Web-Caching

The revolution in the Internet and the World Wide Web began with the introduction of server-side web development platforms. These platforms made the web dynamic: web pages could now be generated on the fly, based on request parameters.

The success of Content Management Systems such as WordPress, Drupal and Joomla illustrates this well.

But dynamic generation of content made sites with a large user base slow, because generating a single web page requires a number of database queries, such as retrieving site settings and the current user's details. Because the content was regenerated on every user request, the load on web servers increased and users waited longer for their requests to be fulfilled.

When web developers looked for a solution to this problem, they found that most web pages do not change often, and that regenerating the same unchanged page was a sheer waste of web-server resources. So they came up with the concept of the Web-Cache.

A Web-Cache stores copies of the documents passing through it. Subsequent requests are satisfied from the cache until the cache time expires; after that, the page is regenerated. In this way, a Web-Cache reduces the load on the server.

The following algorithm gives a deeper insight into the working of a simple Web-Cache:

for a requested URL, try finding that page in Web-Cache
if the page is present in the Web-Cache:
    return the cached page (i.e. do not regenerate the page)
else:
    generate the page
    save the generated page in the Web-Cache
    return the generated page

The above algorithm can be used to implement a basic Web-Cache. But to make caching more effective and robust, more complex algorithms have been developed.
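
One simple refinement is the cache-time expiry mentioned earlier: each stored page is kept only for a fixed lifetime and is regenerated once that time has passed. The sketch below is one possible way to do this, not how any specific system does it. The class name ExpiringPageCache, the ttl_seconds parameter, and the render function are all assumptions made for the example, and the clock is passed in so the behaviour can be shown without actually waiting.

```python
import time


class ExpiringPageCache:
    """Hypothetical cache that regenerates a page once its lifetime has passed."""

    def __init__(self, ttl_seconds, generate, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds   # how long a stored page stays valid
        self.generate = generate         # function that builds a page for a URL
        self.clock = clock               # injectable so examples need not sleep
        self.entries = {}                # URL -> (page, time it was stored)

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None:
            page, stored_at = entry
            if now - stored_at < self.ttl_seconds:
                return page              # still fresh: serve the stored copy
        # Missing or expired: regenerate and store with a new timestamp.
        page = self.generate(url)
        self.entries[url] = (page, now)
        return page


# Usage with a fake clock, so the expiry can be seen immediately.
fake_now = [0.0]
calls = []


def render(url):
    calls.append(url)
    return "page version %d for %s" % (len(calls), url)


cache = ExpiringPageCache(60, render, clock=lambda: fake_now[0])
v1 = cache.get("/home")   # generated at t = 0
fake_now[0] = 30.0
v2 = cache.get("/home")   # t = 30, within 60 seconds: served from the cache
fake_now[0] = 61.0
v3 = cache.get("/home")   # t = 61, expired: generated again
```

A shorter lifetime keeps pages fresher at the cost of more regeneration; a longer one saves more work but may serve stale content for longer.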

In this series of articles we will discuss how popular Content Management Systems and frameworks implement Web-Caching. So, stay tuned for more.

1 comment:

  1. Quote: "for a requested URL, try finding that page in Web-Cache"

    But the point is that the pages ARE dynamically generated, which means that checking whether the required (dynamic) page is present in the cache or not means first generating it and comparing it against the cache and checking if they are the same. But since we're already generating the page, it defeats the purpose.

    Correct me if I'm missing something.

    Cheers
    Shashank
