Technical Details
The first thing we need to do is configure Nginx to determine what differentiates our cached pages. In our example, we’re using device type, since this customer’s web app serves different content to iPad users, to iPod and iPhone users, to generic tablet users, to other smartphone users, and to desktop users. The customer actually uses a PHP library with literally hundreds of different regexes to determine device type. However, we didn’t really want to duplicate all of that in the Nginx config, so we picked the low-hanging fruit. In a test against their logged User-Agent strings, the config below missed only about 5,000 out of 1,000,000. Since those 5,000 are passed through to the PHP back end for proper device determination anyway, missing such a small percentage doesn’t matter.
/etc/nginx/conf.d/devicetype.conf
# We are going to set a variable named $device_type based on the User-Agent string.
# We will use this variable later in processing to determine A) whether this is cacheable
# content, and B) what type of device it is, so that each device type can get its
# own content, and we can still cache that content to alleviate server load.
# NOTE: the block below is a first-match-only regex block. Once we match 'iPad', for instance,
# we never even try to match any of the others.
# Also note: ~* is a case-INsensitive match, ~ is a case-sensitive match.
map $http_user_agent $device_type {
    default 'none';
    # bots (these could perhaps be treated as 'desktop'?)
    ~*bot|spider|Ezooms|crawl 'bot';
    # let's separate out iPad/iPod/iPhone from other devices, since we know we serve special content
    # for devices running iOS, and separate content for large devices (iPad) vs small ones (iPhone|iPod).
    ~iPad 'ipad';
    ~iPod|iPhone 'iphone';
    # the word "Tablet" below should catch a huge chunk of tablets (including tablets running IE),
    # then we explicitly match some of the common ones
    ~*Tablet|Kindle|Silk|Nook|PlayBook|Transformer 'tablet';
    # The string "Phone" catches Windows Phone, smartphone, and no doubt dozens of others we aren't aware of,
    # and our confidence level is high that nobody will be sending 'Phone' in their browser string
    # if their device is really, say, a tablet or desktop.
    # Then we hit a few of the more common / least likely to be mistaken ones out there.
    # We can also assume all non-PlayBook (see above) BlackBerry devices are phones.
    ~*Phone|IEMobile|Zune|Palm|BlackBerry 'phone';
    # we flag MSIE on Windows or Mac, as well as Mac OS X (OS X does not run on tablets or phones).
    # Windows NT seems to catch Firefox, Chrome, and IE in most cases, and no OS reporting itself
    # as Windows NT should ever be a mobile device. Some old browsers also report as Win98.
    ~*MSIE.*Windows|MSIE.*Mac|OS\ X|Windows\ NT|Windows\ 98|X11.*Linux|X11.*CrOS 'desktop';
}
# Disable caching if we didn't match a specific type of device/content above.
# We do this here as a 'map' statement instead of later as an 'if' statement;
# see http://wiki.nginx.org/IfIsEvil for why we made this change.
map $device_type $no_cache {
    default '0';
    none '1';
}
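If you want to sanity-check the patterns against your own logged User-Agent strings before touching Nginx, the map above can be approximated in shell. This is only a rough sketch: classify_ua and summarize_uas are hypothetical helper names, and grep -E stands in for the PCRE matching Nginx does, which should behave the same for patterns this simple but is not guaranteed to be identical.

```shell
#!/bin/sh
# Hypothetical helper: approximates the map block in devicetype.conf.
# Patterns are tried top to bottom and the first match wins, like the map.
# grep -i mirrors "~*" (case insensitive); plain grep mirrors "~".
classify_ua() {
    ua=$1
    if   printf '%s\n' "$ua" | grep -Eqi 'bot|spider|Ezooms|crawl'; then echo bot
    elif printf '%s\n' "$ua" | grep -Eq  'iPad'; then echo ipad
    elif printf '%s\n' "$ua" | grep -Eq  'iPod|iPhone'; then echo iphone
    elif printf '%s\n' "$ua" | grep -Eqi 'Tablet|Kindle|Silk|Nook|PlayBook|Transformer'; then echo tablet
    elif printf '%s\n' "$ua" | grep -Eqi 'Phone|IEMobile|Zune|Palm|BlackBerry'; then echo phone
    elif printf '%s\n' "$ua" | grep -Eqi 'MSIE.*Windows|MSIE.*Mac|OS X|Windows NT|Windows 98|X11.*Linux|X11.*CrOS'; then echo desktop
    else echo none
    fi
}

# Hypothetical helper: read one User-Agent per line on stdin and print a
# count per device type. It spawns grep for every line, so it is slow on
# very large logs; sample the log first if that matters.
summarize_uas() {
    while IFS= read -r line; do
        classify_ua "$line"
    done | sort | uniq -c
}
```

Feeding it one User-Agent per line (however you extract them from your access log) gives a quick count of how many requests would land in 'none' and therefore bypass the cache.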
Next, we need to configure our cache location. In this case, we put the cache in /dev/shm/microcache (on an Ubuntu box) for performance reasons. If you care about the cache surviving a reboot, you probably want to locate it elsewhere. It is also worth noting that Nginx will create the cache path if it does not exist, so it’s safe to put this in locations that get cleared at each reboot (such as shared memory or /tmp) without having to modify startup scripts to account for it.
Like devicetype.conf, this needs to be declared outside of, and prior to, the “server {}” block, so we put it in /etc/nginx/conf.d/microcache.conf
/etc/nginx/conf.d/microcache.conf
fastcgi_cache_path /dev/shm/microcache levels=1:2 keys_zone=microcache:5M max_size=1G inactive=2h;
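Since /dev/shm is backed by memory, the max_size=1G above comes out of RAM, so it may be worth checking that the filesystem actually has that much room. A minimal sketch, assuming a POSIX df; has_room_for_cache is a hypothetical helper name, and the 1048576 KB default simply mirrors max_size=1G.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEED_KB kilobytes available (default 1048576 KB = 1G, as in max_size=1G).
has_room_for_cache() {
    dir=$1
    need_kb=${2:-1048576}
    # df -P prints stable POSIX columns; field 4 on line 2 is available KB.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, `has_room_for_cache /dev/shm || echo "not enough room for the cache"` would flag a box where the shared-memory filesystem is smaller than the configured cache.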
Now, all we have left is to configure the “server {}” specific Nginx options for the “location” in question. You will no doubt want to edit the example below to suit your uses. In this case, the customer only wanted to cache the banner ad traffic, which was restricted to /ads/video/ URIs.
/etc/nginx/sites-enabled/adexcite.com
# this is the micro-caching setup
# see also /etc/nginx/conf.d/microcache.conf
# see also /etc/nginx/conf.d/devicetype.conf
location ~ ^/ads/video/.*\.php {
    try_files $uri =404;
    fastcgi_cache microcache;
    # Bypass cache if flag is set, from devicetype.conf
    fastcgi_no_cache $no_cache;
    fastcgi_cache_bypass $no_cache;
    # use $device_type as part of the cache key, from devicetype.conf
    fastcgi_cache_key $server_name|$device_type|$request_uri;
    fastcgi_cache_valid any 15m;
    fastcgi_pass unix:/var/run/php5-fpm/adexcite.com.socket;
    fastcgi_index index.php;
    include /etc/nginx/fastcgi_params;
}
# EOF microcaching section
So, as you can probably see from the above, the logic flow is now as follows:
1) Browser requests a web page.
2) Nginx determines its device type from the User-Agent string and stores it in the $device_type variable.
3) If it cannot determine the device type acceptably, it sets the $no_cache flag, which forces the web server to pass the request on to the PHP code, which will handle the request (hopefully) intelligently.
4) Otherwise, when Nginx looks up the URI in the cache, it looks for a copy cached for the specific device type of the current request.
5) From there on, it works like any other cache.
Debugging techniques (how to figure out why your first attempt didn’t work)
You can easily use curl to imitate almost any browser-side criteria. For the example above, I used the following to send a fake User-Agent string and test whether I was caching (or not caching) correctly:
curl -A "iPad" "http://domain.com/path/to/script.php?eid=129484"
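To try several device types in one go, the curl call can be wrapped in a small loop. This is a sketch only: probe_device_types is a hypothetical helper name, and the status code and total time it prints are just a hint (a cached response is usually faster on the second run), not proof of a cache hit.

```shell
#!/bin/sh
# Hypothetical helper: request URL once per User-Agent given on the command
# line and print "<user-agent><TAB><http status> <total seconds>" for each.
probe_device_types() {
    url=$1
    shift
    for ua in "$@"; do
        # -s silences progress output, -o /dev/null discards the body,
        # -w prints the status code and total time after the transfer.
        result=$(curl -s -o /dev/null -A "$ua" -w '%{http_code} %{time_total}' "$url")
        printf '%s\t%s\n' "$ua" "$result"
    done
}
```

Running something like `probe_device_types "http://domain.com/path/to/script.php?eid=129484" iPad iPhone "Windows NT" curl-test` twice in a row should show the second pass returning faster for the types that are being cached, while an unmatched string such as curl-test keeps going to PHP.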
To quickly see both whether you have cached objects and what cache key they were stored under:
head `find /dev/shm/microcache/ -type f` | strings | grep KEY
# example output from above
KEY: adexcite.com|iphone|/ads/video/controller.php?eid=10431
KEY: adexcite.com|desktop|/ads/video/controller.php?eid=10531
etc.
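If you are looking for one specific entry rather than listing everything, you can work out where it should live. Nginx names each cache file after the MD5 hash of its cache key, and with levels=1:2 it places the file under a one-character directory (the last character of the hash) and then a two-character directory (the two characters before that). A sketch assuming md5sum is available; cache_path_for_key is a hypothetical helper name and the default root is the path used above.

```shell
#!/bin/sh
# Hypothetical helper: print the path where a key would be stored under a
# cache configured with levels=1:2.
cache_path_for_key() {
    key=$1
    root=${2:-/dev/shm/microcache}
    # printf '%s' avoids a trailing newline, which would change the hash.
    hash=$(printf '%s' "$key" | md5sum | cut -d' ' -f1)
    level1=$(printf '%s' "$hash" | cut -c32)      # last hex character
    level2=$(printf '%s' "$hash" | cut -c30-31)   # the two characters before it
    printf '%s/%s/%s/%s\n' "$root" "$level1" "$level2" "$hash"
}
```

For instance, `ls -l "$(cache_path_for_key 'adexcite.com|iphone|/ads/video/controller.php?eid=10431')"` would show whether that particular device/URI combination is currently cached, provided the key string matches what fastcgi_cache_key produced.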