PHP Regular Expressions Escaping Special Characters in a Regular Expression - Web Development and Design | Tutorial for Java, PHP, HTML, Javascript

Monday, July 8, 2019

PHP Regular Expressions Escaping Special Characters in a Regular Expression

PHP Regular Expressions


Escaping Special Characters in a Regular Expression

Problem

You want to have characters such as * or + treated as literals, not as metacharacters, inside a regular expression. This is useful when allowing users to type in search strings you want to use inside a regular expression.

Solution

Use preg_quote() to escape PCRE metacharacters: 

       $pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N').':(\d+)';
       if (preg_match("/$pattern/",$book_rank,$matches)) {
            print "Leo Rosten's book ranked: ".$matches[1];
       }
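
As a minimal sketch of the solution in action, assume $book_rank holds a line of text in the form title:rank. The sample value below is hypothetical (the recipe does not show what $book_rank contains) and is there only to make the snippet runnable on its own.

```php
<?php
// Hypothetical input: the recipe never defines $book_rank.
$book_rank = 'The Education of H*Y*M*A*N K*A*P*L*A*N:12';

// Escape the literal title, then append the part that is meant to be a pattern.
$pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N') . ':(\d+)';

$rank = null;
if (preg_match("/$pattern/", $book_rank, $matches)) {
    // $matches[1] is the text captured by (\d+).
    $rank = $matches[1];
    print "Leo Rosten's book ranked: " . $rank . "\n";
}
```

Only the title goes through preg_quote(). The :(\d+) suffix is appended afterward so that its parentheses and \d keep their special meaning.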

Discussion

Here are the characters that preg_quote() escapes:

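       . \ + * ? [ ^ ] $ ( ) { } = ! < > | : - #

Each of these is escaped with a preceding backslash. The - has been escaped since PHP 5.3 and the # since PHP 7.3. The forward slash is not in this list because it is not a PCRE metacharacter.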
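
To make the escaping concrete, here is a small sketch that runs a few literal strings through preg_quote(). The sample strings are arbitrary and chosen only for illustration.

```php
<?php
// Arbitrary sample strings containing PCRE metacharacters.
$samples = ['H*Y*M*A*N', 't.c', '$5.00 (approx.)'];

// preg_quote() is called with one argument, so no delimiter is escaped here.
$escaped = array_map('preg_quote', $samples);

foreach ($samples as $i => $sample) {
    print $sample . ' => ' . $escaped[$i] . "\n";
}
// H*Y*M*A*N => H\*Y\*M\*A\*N
// t.c => t\.c
// $5.00 (approx.) => \$5\.00 \(approx\.\)
```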

You can also pass preg_quote() an additional character to escape as a second argument. It’s useful to pass your pattern delimiter (usually /) as this argument so it also gets escaped. This is important if you incorporate user input into a regular expression pattern. The following code expects $_GET['search_term'] from a web form and searches for words beginning with $_GET['search_term'] in a string $s:

       $search_term = preg_quote($_GET['search_term'],'/');
       if (preg_match("/\b$search_term/i",$s)) {
            print 'match!';
       }

Using preg_quote() ensures the regular expression is interpreted properly if, for example, a Magnum, P.I. fan enters t.c as a search term. Without preg_quote(), the unescaped . matches any character, so the pattern matches tic, tucker, and any other word whose first letter is t and third letter is c.
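
The difference can be seen in a short sketch. The search term is hard-coded here in place of $_GET['search_term'], and the two subject strings are made up for the example.

```php
<?php
// Stand-in for $_GET['search_term']; hard-coded so the snippet runs anywhere.
$raw    = 't.c';
$quoted = preg_quote($raw, '/');

// Illustrative subject strings, not taken from the recipe.
$word   = 'tucker';
$phrase = 'T.C. flies the helicopter';

// Unescaped: the dot is a wildcard, so "tuc" in "tucker" matches.
$raw_hits_word = preg_match("/\b$raw/i", $word);         // 1

// Escaped: only a literal "t.c" (case-insensitive) matches.
$quoted_hits_word   = preg_match("/\b$quoted/i", $word);   // 0
$quoted_hits_phrase = preg_match("/\b$quoted/i", $phrase); // 1
```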

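Passing the pattern delimiter to preg_quote() as well makes sure that user input containing forward slashes, such as CP/M, is also handled correctly. Otherwise the slash in CP/M would end the pattern early, and PHP would reject the leftover M as an unknown modifier.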
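
A brief sketch of the delimiter argument's effect follows. The search term is hard-coded in place of form input and the subject sentence is invented for the example.

```php
<?php
// Stand-in for user input containing the pattern delimiter.
$term = 'CP/M';

// Without the second argument the slash is left untouched.
$without = preg_quote($term);       // CP/M

// With '/' as the second argument the slash is escaped too.
$with = preg_quote($term, '/');     // CP\/M

// Only the second form is safe to place between / delimiters.
$subject = 'Early microcomputers often ran CP/M.';
$found = preg_match("/\b$with/i", $subject); // 1
```

If the pattern uses a different delimiter, such as ~ or #, pass that character as the second argument instead.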
