2.12.11. Regular expressions

It is convenient to test and debug regular expressions using services like regex101.com or regexr.com.

In .htaccess directives, you can use regular expressions to define rules flexibly. An example of a directive that uses a regular expression:

RedirectMatch /([^/]*)/([^/]*)/script\.php$ http://example.com/index.php?$1=$2

Commonly used constructs:
  • . - matches any single character.
  • ? - makes the preceding character (or group) optional; for example, tes?t matches both test and tet.
  • * - the preceding character (or group) may repeat zero or more times, so it can be absent or occur any number of times.
  • [abc] - a character class: matches any one of the characters a, b, or c.
    • [^abc] - a negated class: matches any one character except a, b, or c.
    • [abc]* - a sequence of zero or more consecutive characters, each of which is a, b, or c.
    • [^abc]* - a sequence of zero or more consecutive characters, none of which is a, b, or c.
  • + - the preceding character (or group) must occur one or more times (unlike *, at least one occurrence is required).
  • ! - logical NOT, used to form exclusion rules. In mod_rewrite directives it is a prefix placed before the whole pattern (for example, in RewriteCond) rather than a regular expression metacharacter.
  • | - logical OR (alternation), usually used inside a group.
  • () - grouping; the text matched by a group can be referenced later as $1, $2, and so on.
  • {x} - the preceding character (or group) repeats exactly x times; a range is written with a comma: {x,y} means from x to y times, {x,} means at least x times.
  • \ - escapes a special character so that it is treated literally in a pattern (for example, \. matches a dot).
  • .* - any number of any characters, including none. In a URL, /.*/ matches everything from the first slash to the last one, because * is greedy.
  • ^ - the beginning of the string; placed at the start of the expression.
  • $ - the end of the string; placed at the end of the expression.
  • \w - a letter, digit, or underscore _.
  • \W - any character other than a letter, digit, or underscore.
  • \d - any digit.
  • \D - any character other than a digit.
  • \s - any whitespace character (space, tab, line break).
  • \S - any non-whitespace character.
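
As an illustration of how several of these constructs combine, here is a hedged sketch of a redirect rule. The /news/ path layout, the archive.php script, and its year and month parameters are hypothetical and serve only as an example.

```apache
# Assumed URL layout: /news/<4-digit year>/<1-2 digit month>/ (hypothetical)
#   ^ and $    anchor the pattern to the whole URL path
#   \d{4}      exactly four digits, captured as $1
#   \d{1,2}    one or two digits, captured as $2
#   /?         optional trailing slash
RedirectMatch 301 ^/news/(\d{4})/(\d{1,2})/?$ http://example.com/archive.php?year=$1&month=$2
```

With this rule, a request for /news/2023/7/ would be redirected to http://example.com/archive.php?year=2023&month=7, while /news/23/7/ would not match, because the year must have exactly four digits.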

Character ranges (based on the ASCII table):

  • [0-9] - any digit from 0 to 9.
  • [a-z] - any lowercase Latin letter from a to z.
  • [A-Z] - any uppercase Latin letter from A to Z.
  • [a-zA-Z] - any Latin letter in either case. The shorter form [a-Z] is not valid, because Z comes before a in the ASCII table and the range would be out of order.
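
A short sketch of ranges inside a rewrite rule may help here. It assumes that mod_rewrite is enabled and that the .htaccess file is in the site root; the user/ prefix, the profile.php script, and its name parameter are hypothetical.

```apache
RewriteEngine On
# In .htaccess, the RewriteRule pattern is matched against the path
# without its leading slash, so it starts with "user/" rather than "/user/".
# [a-zA-Z0-9_]+ accepts one or more Latin letters, digits, or underscores.
RewriteRule ^user/([a-zA-Z0-9_]+)/?$ /profile.php?name=$1 [L]
```

Under these assumptions, /user/john_01 would be served by /profile.php?name=john_01, while /user/john.doe would be left alone because the dot is outside the listed ranges.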

For example, consider two URLs:

  1. http://example.com/someurl/44/test/regular/expression.php
  2. http://example.com/url/4/another/test/somefile.php

To create a condition that matches all requests whose path starts with someurl (only the first address here), specify it like this:

RewriteCond %{REQUEST_URI} ^/someurl/
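
A RewriteCond directive has no effect on its own: it only restricts the RewriteRule that follows it. A minimal sketch of a complete rule set, assuming mod_rewrite is enabled; the target URL is a placeholder.

```apache
RewriteEngine On
# Condition: the request path starts with /someurl/
RewriteCond %{REQUEST_URI} ^/someurl/
# Rule: applied only when the condition above holds.
# The pattern ^ matches any path; the target URL is a placeholder.
RewriteRule ^ http://example.com/new-location/ [R=302,L]
```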

The following condition matches both addresses:

RewriteCond %{REQUEST_URI} ^/.*url/

Using alternation (OR), you can list several options in one condition:

RewriteCond %{REQUEST_URI} ^/(someurl|url)/
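
The group in such a condition can also be reused in the rule that follows. The sketch below assumes mod_rewrite is enabled; index.php and its section parameter are hypothetical.

```apache
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/(someurl|url)/
# %1 holds the text captured by the first group of the condition above:
# "someurl" for the first address and "url" for the second.
# index.php and its "section" parameter are hypothetical.
RewriteRule ^ /index.php?section=%1 [L]
```

Note that %1 refers to groups in RewriteCond, while $1 refers to groups in the RewriteRule pattern itself.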

Using ranges and repetition, you can match specific values. The following condition matches only the first address, where the second path segment is a two-digit number:

RewriteCond %{REQUEST_URI} ^/[^/]+/[0-9]{2}/

A similar condition with a single digit matches only the second address:

RewriteCond %{REQUEST_URI} ^/[^/]+/[0-9]/

To exclude addresses that contain /regular/ from a rule, negate the pattern with the ! prefix:

RewriteCond %{REQUEST_URI} !/regular/
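
A negated condition is usually combined with a positive one. The following sketch assumes mod_rewrite is enabled and uses a placeholder target URL.

```apache
RewriteEngine On
# Consecutive RewriteCond lines are combined with AND, so both must hold.
RewriteCond %{REQUEST_URI} ^/.*url/
RewriteCond %{REQUEST_URI} !/regular/
RewriteRule ^ http://example.com/ [R=302,L]
```

With these conditions, the second address would be redirected, while the first would not, because its path contains /regular/.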