In his book “High Performance Web Sites”, Steve Souders states that fast response times are a major concern when developing web sites, alongside an enhanced user experience. Images can make a page load more slowly, so it is important to make full use of the browser’s caching capabilities for these resources. This can be achieved by applying Souders’ third rule, “Add an Expires Header”.
A first-time visitor to your page may have to make several HTTP requests, but by using a future Expires header, components are cached in the browser for a duration indicated by the Expires header. The Expires header eliminates the need to check with the server by making it clear whether the browser can use its cached copy of a component. The Expires header is sent by the server. When the browser sees an Expires header in the response, it saves the expiration date with the component in its cache. As long as the component hasn’t expired, the browser reads the components straight from disk without generating an HTTP request.
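As a minimal sketch of what the server has to produce, the snippet below formats a far future Expires value as an HTTP date. The function name `far_future_expires` and the ten-year default are assumptions chosen for illustration; they do not come from the book or from any particular web server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_expires(now=None, days=3650):
    """Return an Expires header value `days` days after `now` (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # usegmt=True renders the date in the "... GMT" form used by HTTP headers;
    # it requires a timezone-aware datetime with a zero UTC offset.
    return format_datetime(now + timedelta(days=days), usegmt=True)


if __name__ == "__main__":
    print("Expires:", far_future_expires())
```

With a fixed start date, `far_future_expires(datetime(2010, 1, 1, tzinfo=timezone.utc), days=365)` returns `Sat, 01 Jan 2011 00:00:00 GMT`.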
The Expires header was implemented in HTTP/1.0. If the component is served from cache, you can find the following details at the bottom of the Headers tab.
Adding an Expires header to your components limits the number of HTTP requests and decreases the size of the HTTP responses on subsequent page views.
There is a small disadvantage: because the Expires header uses a specific date, the expiration dates have to be constantly checked, and when that future date finally arrives, a new date must be provided by the server.
A future Expires header is most often used with images, but it should be used on all components, including scripts, stylesheets, and video. Adding a future Expires header incurs some additional development costs.
To analyze which of your components are good candidates for caching, look at the Last-Modified dates in the HTTP headers. If the Last-Modified date of a resource lies far in the past, the resource rarely changes and is a good candidate to be made cacheable.
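This check can be sketched in a few lines. The function name `is_cache_candidate` and the 30-day threshold are assumptions; pick whatever threshold matches how often your resources really change.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_cache_candidate(last_modified, now, min_age_days=30):
    """True if the Last-Modified header value is at least `min_age_days` old.

    `last_modified` is the raw header value and `now` is a timezone-aware
    datetime. The 30-day default is an arbitrary, illustrative threshold.
    """
    modified_at = parsedate_to_datetime(last_modified)
    return now - modified_at >= timedelta(days=min_age_days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_cache_candidate("Tue, 12 Dec 2006 03:03:59 GMT", now))
```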
GET and conditional GET
If the browser is set up to cache resources, it will store the downloaded resource in cache, together with the headers that have been sent by the server. On a subsequent request, the resource will be used from cache if the HTTP headers indicate that the resource is still valid. A resource is still valid if an Expires header is available and its date lies in the future. In that case no roundtrip to the server will occur.
In most cases a far future Expires header is set. If a component does not have a far future Expires header, it’s still stored in the browser cache. On subsequent requests the browser checks the cache and finds that the component is expired (in HTTP terms it is “stale”). For efficiency, the browser sends a conditional GET request to the origin server. If the component hasn’t changed, the origin server avoids sending back the entire component and instead sends back a few headers telling the browser to use the component in its cache.
Although the conditional GET request is faster than a request for a component that isn’t cached, it is still slower than using a far future Expires header, because an additional HTTP request is needed.
If no Expires header is available or if its date lies in the past, a request will be sent to the server. If the resource is not valid anymore, the server will return status code 200 OK and will send the resource to the browser. If the resource is still valid, the server will send a status code 304 Not Modified without the resource. This tells the browser that it can use the resource from its cache.
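The browser side of this decision can be sketched as follows. This is a simplification that only looks at the Expires header, as described above; real browsers apply more rules. The function name `browser_action` and its return strings are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def browser_action(cached_headers, now):
    """Decide what to do with a cached resource, looking only at Expires.

    `cached_headers` is a dict of the response headers stored with the
    resource; `now` is a timezone-aware datetime.
    """
    expires = cached_headers.get("Expires")
    if expires is not None:
        try:
            if parsedate_to_datetime(expires) > now:
                return "use cache"  # still valid: no roundtrip to the server
        except (TypeError, ValueError):
            pass  # unparsable date: treat the resource as expired
    return "conditional GET"  # missing or past Expires: ask the server


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(browser_action({"Expires": "Thu, 01 Jan 2009 00:00:00 GMT"}, now))
```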
The Expires header has two major disadvantages:
- The time is an absolute date in GMT. If the clocks of the web server and the browser are not synchronized, your caching mechanism might not behave as intended. This issue is solved with the Cache-Control header.
- If a far future date is set and the resource changes in the meantime, the change is not detected as the resource will be served from cache till the expiration date. This can be solved by revving the filename of the resource.
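One way to rev a filename is to embed a short fingerprint of the file content in its name, so that a changed resource automatically gets a new URL. The sketch below assumes a content hash; a manually maintained version number in the filename works just as well. The helper name `revved_name` and the 8-character digest length are arbitrary choices for this example.

```python
import hashlib
from pathlib import PurePosixPath


def revved_name(filename, content):
    """Insert a short hash of `content` (bytes) before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    # "img/logo.png" becomes "img/logo.<digest>.png"
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    print(revved_name("img/logo.png", b"first version of the image"))
    print(revved_name("img/logo.png", b"second version of the image"))
```

Because the name changes whenever the content changes, pages that reference the new name bypass the cached copy of the old file, even if that copy has a far future expiration date.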
The Cache-Control header
The Cache-Control header was introduced in HTTP/1.1. For example:
Cache-Control: public, max-age=86400
Cache-Control accepts a number of directives that you can set:
- public: in general authenticated resources are not cacheable. By declaring the resource as public, the resource can be cached by the browser, but also by proxy caches.
- private: the resource may not be cached by proxy caches, only browsers are allowed to cache the resource.
- no-store: instructs caches not to store the resource at all, so it is requested from the server each time.
- no-cache: the resource may be stored, but the browser must revalidate it with the server on each request before using the cached copy.
- must-revalidate: HTTP allows caches to serve stale resources under special conditions. By specifying must-revalidate you force the browser to strictly follow your validation rules.
- proxy-revalidate: this is similar to must-revalidate, except that it only applies to proxy caches.
- max-age=[seconds]: this setting specifies the maximum amount of time that a resource can be served from cache. The behaviour is similar to the Expires header, except that the value is a relative number of seconds counted from the date/time the resource was received from the server. If fewer than max-age seconds have passed since the component was requested, the browser will use the cached version, thus avoiding an additional HTTP request. A far future max-age value might set the freshness window 10 years in the future.
- s-maxage=[seconds]: this setting is similar to max-age, except that it only applies to proxy caches.
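To make the directive syntax concrete, here is a small sketch that splits a Cache-Control value into its directives. It is deliberately naive: it splits on commas and does not handle quoted strings that contain commas. The function name `parse_cache_control` is an assumption for this example.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives.

    Directives without an argument (public, no-cache, ...) map to True;
    directives with an argument (max-age=86400) map to the argument string.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


if __name__ == "__main__":
    print(parse_cache_control("public, max-age=86400"))
```

For the header shown earlier, `parse_cache_control("public, max-age=86400")` returns `{"public": True, "max-age": "86400"}`.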
You still might want an Expires header for browsers that don’t support HTTP/1.1 (even though this is probably less than 1% of your traffic). You could specify both response headers, Expires and Cache-Control max-age. If both are present, the HTTP specification dictates that the max-age directive will override the Expires header.
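The precedence rule can be sketched as a function that computes how long a response stays fresh. It is a simplified model: the function name `freshness_lifetime` is hypothetical, and the time the response was received stands in for the response's Date header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, received_at):
    """Return the number of seconds a response may be served from cache.

    max-age wins over Expires when both are present; with neither header
    the response is treated as immediately stale (0 seconds).
    """
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return int(arg)
    expires = headers.get("Expires")
    if expires:
        try:
            delta = parsedate_to_datetime(expires) - received_at
        except (TypeError, ValueError):
            return 0  # unparsable date: treat as already expired
        return max(0, int(delta.total_seconds()))
    return 0


if __name__ == "__main__":
    received = datetime(2010, 1, 1, tzinfo=timezone.utc)
    both = {
        "Cache-Control": "public, max-age=86400",
        "Expires": "Wed, 01 Jan 2020 00:00:00 GMT",
    }
    print(freshness_lifetime(both, received))  # max-age is used, not Expires
```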
Last-Modified and If-Modified-Since headers
The Last-Modified date is an HTTP response header returned by the server. It indicates when the resource was last modified. The browser caches the component along with its Last-Modified date. The next time the resource is requested – and no Expires header or Cache-Control header is available – the browser adds an If-Modified-Since header to its HTTP request to the server. Its value is set to the cached Last-Modified date.
If the Last-Modified date on the server still matches the If-Modified-Since date sent by the browser, a 304 Not Modified status code is returned and the component is not sent over to the browser. If the dates don’t match, a 200 OK response is sent, along with the component.
Whether a resource is considered valid depends on the other request headers. If the resource was sent over with a Last-Modified date, the browser will send a conditional GET with this date specified in the If-Modified-Since header. If the resource was not modified since the specified date, the server will send a 304 Not Modified to inform the browser that the resource is still valid and may be used from cache. If the resource has been modified since then, the server will send a 200 OK with the new version of the resource.
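The server side of this exchange can be sketched as follows. This is an illustrative model rather than the behaviour of any specific web server; the function name `respond` and its `(status, headers, body)` return shape are assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, last_modified, body):
    """Answer a (conditional) GET using the If-Modified-Since header.

    `last_modified` must be a timezone-aware UTC datetime with whole seconds.
    Returns a (status, headers, body) tuple.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            unchanged = last_modified <= parsedate_to_datetime(since)
        except (TypeError, ValueError):
            unchanged = False  # unparsable date: fall back to a full response
        if unchanged:
            return 304, headers, b""  # Not Modified: no body is sent
    return 200, headers, body  # first request or changed resource


if __name__ == "__main__":
    modified = datetime(2006, 12, 12, 3, 3, 59, tzinfo=timezone.utc)
    request = {"If-Modified-Since": "Tue, 12 Dec 2006 03:03:59 GMT"}
    print(respond(request, modified, b"image bytes")[0])  # 304
```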
An ETag also triggers a conditional GET: the browser sends the cached ETag value back to the server in the If-None-Match request header. If the value of the ETag hasn’t changed, the server will send a 304 Not Modified, otherwise it will send a 200 OK with the new version of the resource included.
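A minimal sketch of ETag validation is shown below. It assumes the ETag is derived from a hash of the content, which is only one of several ways servers generate ETags, and it only handles a single value in If-None-Match (the header may also carry a list or `*`). The names `make_etag` and `respond_with_etag` are hypothetical.

```python
import hashlib


def make_etag(body):
    """Build a quoted ETag from a hash of the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(request_headers, body):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""  # cached copy is still current
    return 200, {"ETag": etag}, body  # no match: send the full resource


if __name__ == "__main__":
    status, headers, _ = respond_with_etag({}, b"body { color: red }")
    print(status, headers["ETag"])
    revalidation = {"If-None-Match": headers["ETag"]}
    print(respond_with_etag(revalidation, b"body { color: red }")[0])  # 304
```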
When using Firebug, the timeline clearly shows which resource is downloaded from the server and which resource is used from cache: all lines are conditional GETs. The black lines represent the resources that are downloaded from the server and the gray lines represent the resources used from cache. The resources that are used immediately from cache (because of an Expires header with a date in the future) are not represented on the timeline.
Based on the previous images, we can make the following observations:
- Varying numbers of HTTP requests occur in parallel. The number of HTTP requests that can happen in parallel varies with the number of different host names being used and with whether they use HTTP/1.0 or HTTP/1.1.
So, to cut a long story short:
If a resource has an Expires header with a date in the future, the resource is used from the browser cache. If the Expires header has a date in the past, or if there is no Expires or Cache-Control header, a request is sent to the server with all available validators: If-None-Match (carrying the ETag), If-Modified-Since, etc. If the server evaluates the resource as still being valid, based on the HTTP request headers, a 304 Not Modified status code is returned and the browser will use the resource from its cache. If the server evaluates the resource as obsolete, a 200 OK status is returned together with a newer version of the resource. The new resource will be used on the page and stored in the browser cache, together with its HTTP response headers.
How does this relate to SharePoint?
When developing for SharePoint, most developers pay a lot of attention to the way they write their server-side code, but in most cases they don’t have a good understanding of how it can impact the traffic on the web. Each type of SharePoint caching has its own impact on the HTTP response headers when a SharePoint page is requested.
When the BLOB cache is disabled, the Cache-Control header is set to private,max-age=0.
When the BLOB cache is enabled, the Cache-Control header is set to public, max-age=86400. The value of the max-age setting is defined by the max-age attribute in the BLOBCache element in the web.config. If this attribute is not specified, the value defaults to 86400 seconds, which is 24 hours.
When the browser stores the file locally on the client, it derives the expiration date from the Cache-Control header.
In short, when BLOB cache is disabled, the file is not cached in a folder on the web server, nor locally on the client. This means a roundtrip to the database on each request. When the BLOB cache is enabled, the file is cached in a folder of each web server and also locally on the client. If you only want caching on the web servers and no caching on the client, you have to set the max-age attribute in the web.config to 0.
For more details I refer to my article on BLOB caching.
The output cache has an impact on the first request where the HTML is requested. If the output cache is disabled, the Cache-Control header is set to private,max-age=0.
If the output cache is enabled, you will see in the HTTP response headers that the Cache-Control header is set to public, max-age=value. The value of the max-age setting is defined by the Duration setting in the output cache profile.
In short, when output cache is disabled, the HTML is not cached in memory on the web server, nor locally on the client. This means that the page needs to be reassembled, with the corresponding roundtrips to the database, on each request. When the output cache is enabled, the HTML is cached in memory of each web server and also locally on the client.
For more details I refer to my article on output caching.
- Book “High Performance Web Sites” by Steve Souders
- HTTP headers and caching