Real Time Concepts

Nginx map Directive

What is it?

The nginx map module allows you to map one set of values onto a different set of values, storing the result in a variable. The map directive is simple to understand and is generally laid out as a table for readability. It takes an input string, often composed of arbitrary text and nginx internal variables, and outputs another string as a variable. It is possible to create a map with no variables in its input, but this would not serve any useful purpose. Patterns prefixed with ~ are treated as regular expressions with full PCRE syntax, as the examples will show. The structure of a map lets you define the rules for a decision in one place and then invoke those rules later to make the decision, such as where to redirect a request, which FastCGI backend to use, or even whether to allow a request at all!

How do I use it?

One of the key facts about the map{} directive is that it must be defined in the global http{} block rather than in a given server{} block. Because of this, I usually place each map in its own file in the conf.d or equivalent folder, named with the format “map-mapname.conf” so it is easy to tell what each file will load. Each file contains exactly one map, and no conflicting map names are allowed. This convention is not enforced; it is simply a best practice I have developed while working on configurations that use maps. To invoke a map, you simply refer to it by its full name, including the $. In example three, for instance, I use “fastcgi_pass $wprouter;” to invoke the map and use its result to decide which FastCGI instance handles the request.
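As a rough sketch of how the pieces fit together, the fragment below shows a map file and a server{} block that invokes it. The file name follows the naming convention above; the socket path, the address 10.0.0.1:9000, the server name, and the document root are all assumptions made up for illustration, and the map itself is deliberately cut down to a single rule.

```nginx
# conf.d/map-wprouter.conf
# Loaded inside http{}, for example via "include conf.d/*.conf;".

# Hypothetical upstreams; the socket path and address are placeholders.
upstream ro-php { server unix:/var/run/php-fpm-ro.sock; }
upstream rw-php { server 10.0.0.1:9000; }

# The map lives at http{} level, never inside server{}.
map $request_method $wprouter {
    default ro-php;
    POST    rw-php;
}

server {
    listen 80;
    server_name example.com;
    root /var/www/html;

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        # Invoke the map by its full name, including the $.
        fastcgi_pass $wprouter;
    }
}
```

Because the value handed to fastcgi_pass is a variable, nginx first looks for an upstream{} group with the resulting name, which is why the map outputs match the upstream names.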

Examples

Example 1

map $http_host $name {
    hostnames;
    default        0;
    example.com    1;
    *.example.com  1;
    test.com       2;
    *.test.com     2;
    .site.com      3;
}

This map is a relatively simple example: it defines a map named “$name” whose input is the internal nginx variable “$http_host” (i.e., the contents of the Host: header). It does the following:

  • Turns on “hostnames”, a matching mode specific to hostnames: entries may carry a leading or trailing wildcard mask (such as *.example.com), and nginx matches them as hostnames rather than as regular expressions
  • Sets a default of “0” as the output if nothing else matches
  • Matches example.com and test.com, each together with its wildcard subdomain form, with outputs of 1 and 2 respectively
  • Matches site.com and all of its subdomains with an output of 3 (the leading-dot form .site.com is shorthand for listing site.com and *.site.com separately)

At first glance this seems pretty useless, but say you’re building a large-scale bulk hosting environment where each user has their own php-fpm, and you need a request router to determine which server gets each request. I wouldn’t personally name my upstreams 0, 1, 2, and so on, but you could. The map could then select the upstream by hostname within one common server{} block, where each php-fpm instance is chrooted into a different location and thus has a different view of the filesystem (so the same filename is actually different content). This approach suits a very specific kind of application hosting rather than generic sites, but it serves as an easily readable example and highlights the use of the ‘hostnames’ option.
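A minimal sketch of that routing idea might look like the following, using descriptive upstream names instead of 0, 1, 2. The variable name $pool, the upstream names, and the socket paths are all hypothetical.

```nginx
# Hypothetical per-customer php-fpm pools; socket paths are placeholders.
upstream php-default { server unix:/run/php-fpm/default.sock; }
upstream php-example { server unix:/run/php-fpm/example.sock; }
upstream php-test    { server unix:/run/php-fpm/test.sock; }

# Pick a pool from the Host: header.
map $http_host $pool {
    hostnames;
    default        php-default;
    example.com    php-example;
    *.example.com  php-example;
    test.com       php-test;
    *.test.com     php-test;
}

# One common server{} block serves every hostname.
server {
    listen 80 default_server;
    root /var/www/html;

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass $pool;
    }
}
```

Since each pool is chrooted separately, the same SCRIPT_FILENAME resolves to different content depending on which pool the map selects.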

Example 2

map $uri $new {
    default  http://www.domain.com/home/;
    /aa      http://aa.domain.com/;
    /bb      http://bb.domain.com/;
    /john    http://my.domain.com/users/john/;
}

This map is also quite simple, but it can serve as the basis of a highly efficient, generic redirect router, similar to what some DNS providers offer as an add-on for ‘redirecting’ one domain to another without running your own server. Requests go to one URL by default, with entries sending specific paths to other specific URLs.
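To actually issue the redirect, the map result can be handed to the return directive. The server{} block below is a sketch only: the server name and the choice of a 302 status are assumptions.

```nginx
server {
    listen 80;
    server_name domain.com;   # assumed name for the redirecting host

    location / {
        # $new comes from the map above; any path not listed
        # in the map falls through to the default target.
        return 302 $new;
    }
}
```

Note that plain (non-regex) map entries are compared against the whole input, so /aa matches but /aa/ or /aa/page would fall through to the default.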

Example 3

map "METHOD:$request_method COOKIE:<$http_COOKIE> TYPE:$content_type URL:$request_uri." $wprouter {
    default                                  ro-php;
    ~METHOD:POST                             rw-php;
    ~METHOD:PUT                              rw-php;
    ~COOKIE:<.*wordpress_(?!test_cookie).*>  rw-php;
    ~TYPE:multipart/form-data                rw-php;
    ~URL:.*wp-admin.*                        rw-php;
    ~URL:.*wp-login.*                        rw-php;
}

This example forms the backbone of the nginx/php-fpm multiserver WordPress-without-Varnish configuration I created. It is the most interesting of the three, as it uses some advanced features.

Starting with the simple parts, the map outputs a string that is used to choose which upstream FastCGI server nginx sends the traffic to. The names are descriptive, but to make it obvious: rw-php runs as a user with read-write privileges and, in a multiserver configuration, runs only on the master (reached via TCP from the master or the slaves), while ro-php runs as a user with read-only privileges and is always reached via a local file socket, whether on a master or a slave. The upside is that this configuration is identical no matter which machine you run it on!

On to the more advanced parts: the input string is in quotes and contains multiple variables, along with freeform text labels to tell the variables apart. Because this is regular expression matching, order can matter for security. After all, any freeform field supplied by a user can be crafted, given knowledge of your structure, to make the request run with more privileges than intended. So always think through your regular expressions, and what they may unintentionally match, when writing a structure like this, to avoid needlessly opening the door to privilege escalation attacks.

You don’t lose any security compared to a similar Apache or traditional nginx configuration, but if an attacker can make a regular expression match a request that it should not, you do lose some of the benefit of the principle of least privilege that this configuration can otherwise bring. In the more general case, an unintended match may simply cause incorrect behavior, depending on what decision you are using the output for.
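One way to narrow what the expressions can match is to anchor them to the position of their field in the input string. The variant below is a sketch of that idea, not the configuration I run; it is an alternative definition of the same map rather than something to load alongside it. It assumes that $request_method is always the first field and that $request_uri never contains a space, so the URL field runs unbroken to the trailing dot. Patterns containing spaces are quoted, and [.] stands in for an escaped dot to avoid backslashes inside quoted strings.

```nginx
map "METHOD:$request_method COOKIE:<$http_COOKIE> TYPE:$content_type URL:$request_uri." $wprouter {
    default ro-php;

    # Only match the method at the very start of the string.
    "~^METHOD:(POST|PUT) "                   rw-php;

    # The cookie and type fields sit in the middle of the string
    # and are left unanchored here, as in the original map.
    ~COOKIE:<.*wordpress_(?!test_cookie).*>  rw-php;
    ~TYPE:multipart/form-data                rw-php;

    # Only match a URL field that runs, without spaces, to the
    # trailing dot at the very end of the string.
    "~ URL:[^ ]*wp-(admin|login)[^ ]*[.]$"   rw-php;
}
```

With the anchors in place, text such as “URL:/wp-admin” smuggled into the Cookie: or Content-Type: header no longer satisfies the URL rule, because the real URL field still follows it.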

Moving on to the specific matches and the fields in the input string:

  • ~METHOD:POST: This matches any request with $request_method set to POST, i.e., the HTTP verb POST. It is simple, and I could fold the PUT rule into it with a more complex regular expression, but I prefer to keep it readable.
  • ~METHOD:PUT: See above for the explanation. WordPress may not use this verb, but it is included for completeness. If I were to combine the two, it would be ~METHOD:P(OST|UT) or, for a more readable form with more repetition, ~METHOD:(POST|PUT).
  • ~COOKIE:<.*wordpress_(?!test_cookie).*>: This one is probably the most complex. I protect the cookie string with <>s because it may contain spaces and is the most obvious field for a user to tamper with. The rule compares against the $http_COOKIE nginx variable, which holds the contents of the Cookie: header of the request, and looks for any instance of wordpress_ NOT followed by test_cookie. This is necessary because WordPress names its cookies with a random hash string but prefixes them all with wordpress_. The exception is wordpress_test_cookie, which is not deleted when logging out of the admin panel and so can cause incorrect request routing after logout if it is not excluded. As a note, since I don’t check that the other fields do not themselves contain <>s, I haven’t fully protected this map against injection to escalate privileges, as discussed above.
  • ~TYPE:multipart/form-data: This matches against the $content_type nginx variable, which holds the Content-Type: header of the request. I included it to catch the requests caught by the similar check in our Varnish VCL, but it may not be entirely necessary for this configuration. As a note, a custom-crafted HTTP request could put arbitrary data in the Content-Type field, so ideally this field would be protected too, with some form of test done beforehand to ensure that the TYPE is sane. A map could be used for that as well, but I have not created one as it is unnecessary for the operation of WordPress.
  • ~URL:.*wp-login.*: This matches any URL containing the string wp-login. I could combine the wp-admin rule with this one, but again I like to keep them simple and readable. Any wp-login load will go to the master (not entirely necessary, but useful for paranoia and safety).
  • ~URL:.*wp-admin.*: This likewise matches any URL containing the string wp-admin. If I were to combine the two, the pattern would use wp-(login|admin) instead.
  • If none of the above match, the request is directed to the local (read-only) PHP-FPM instance instead.
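For comparison, here is what the same map might look like with the combined expressions mentioned in the list. It is an alternative definition of the same map and routes requests the same way; it simply trades two pairs of readable lines for two alternations.

```nginx
map "METHOD:$request_method COOKIE:<$http_COOKIE> TYPE:$content_type URL:$request_uri." $wprouter {
    default                                  ro-php;
    # POST and PUT folded into one rule.
    ~METHOD:(POST|PUT)                       rw-php;
    ~COOKIE:<.*wordpress_(?!test_cookie).*>  rw-php;
    ~TYPE:multipart/form-data                rw-php;
    # wp-login and wp-admin folded into one rule.
    ~URL:.*wp-(login|admin).*                rw-php;
}
```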

I think this illustrates very well the real power of map{} as a decision-making interface that avoids long chains of RewriteCond or if statements. It is also very readable, as you can just glance down the full ‘table’ and see what is sent where.

Credits

  • Wolf for inspiring me to look into nginx in general and answering numerous questions as I developed the $wprouter map
  • Matiu for taking the ideas I postulated and running with them to create a Varnishless multiserver Magento configuration
  • All of MC 2nd for allowing me the time to experiment and create this configuration!