This post covers the PCRE section of the Strings chapter when studying for the Zend PHP 7 Certification.
The PCRE (Perl Compatible Regular Expressions) library provides a set of functions that implement regular expression pattern matching.
When using the PCRE functions, the pattern must be enclosed by delimiters. A delimiter can be any character that is not alphanumeric, a backslash, or whitespace.
Often used delimiters are forward slashes (/), hash signs (#) and tildes (~). The following are all examples of valid delimited patterns.
/foo bar/
#^[^0-9]$#
+php+
%[a-zA-Z0-9_-]%
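As a small illustration of why the choice of delimiter matters, the sketch below matches the same path using two different delimiters. The sample string and variable names are made up for this example. When the delimiter also appears inside the pattern, it has to be escaped with a backslash, so a different delimiter is often easier to read.

```php
<?php
// Hypothetical sample subject, chosen only for illustration.
$subject = "/var/www/html";

// With "/" as the delimiter, every literal slash in the pattern must be escaped.
$withSlashes = preg_match('/^\/var\/www/', $subject);

// With "#" as the delimiter, the slashes need no escaping.
$withHashes = preg_match('#^/var/www#', $subject);

var_dump($withSlashes); // int(1)
var_dump($withHashes);  // int(1)
```

Both calls describe the same pattern; only the readability differs.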
These delimiters can be seen within the preg_match() function, which performs a regular expression match. The function can take up to 5 parameters: the pattern, the subject, and optionally a matches array, flags and an offset.
<?php
preg_match("/PHP/", "PHP is the web scripting language of choice.", $matches);
print_r($matches);
// Outputs:
// Array ( [0] => PHP )
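The optional fourth and fifth parameters, flags and offset, are not used in the example above. As a rough sketch (the subject string and variable names are illustrative), PREG_OFFSET_CAPTURE makes each entry in $matches an array holding the matched text and its byte offset, while offset sets the position at which the search starts.

```php
<?php
$subject = "PHP and more PHP";

// Fourth parameter: PREG_OFFSET_CAPTURE also records where the match starts.
preg_match("/PHP/", $subject, $matches, PREG_OFFSET_CAPTURE);
$firstOffset = $matches[0][1];
echo $firstOffset, "\n"; // 0

// Fifth parameter: start searching at byte 3, which skips the first "PHP".
preg_match("/PHP/", $subject, $matches, PREG_OFFSET_CAPTURE, 3);
$secondOffset = $matches[0][1];
echo $secondOffset, "\n"; // 13
```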
The function can also be used in conditional statements. It returns 1 if the pattern matches the given subject, 0 if it does not, or false if an error occurred.
<?php
if (preg_match("/PHP/", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
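Because both 0 and false are falsy, a plain if cannot tell "no match" apart from an error. The following is a minimal sketch of a strict check; describeMatch() is a hypothetical helper written only for this example, and the malformed pattern is there purely to provoke the error case.

```php
<?php
// Hypothetical helper, written for this example only.
function describeMatch($pattern, $subject)
{
    // "@" hides the warning PHP raises for a malformed pattern,
    // so that the return value can be inspected instead.
    $result = @preg_match($pattern, $subject);

    if ($result === false) {
        return "error";
    }
    return $result === 1 ? "match" : "no match";
}

echo describeMatch("/PHP/", "PHP is fun"), "\n";  // match
echo describeMatch("/Perl/", "PHP is fun"), "\n"; // no match
echo describeMatch("/(PHP/", "PHP is fun"), "\n"; // error (unclosed group)
```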
Modifiers can be appended after the closing delimiter of a regex pattern to change how it is applied. The example below uses the i modifier.
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
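A few other commonly used modifiers are m (multiline), s (dotall) and x (extended). The sketch below compares each one with the unmodified pattern; the sample text and variable names are invented for illustration.

```php
<?php
$text = "first line\nsecond line";

// "m" (multiline): ^ and $ also match at the start and end of each line.
$multiline = preg_match('/^second/m', $text);  // 1
$singleline = preg_match('/^second/', $text);  // 0

// "s" (dotall): the dot also matches newline characters.
$dotall = preg_match('/first.+second/s', $text);  // 1
$noDotall = preg_match('/first.+second/', $text); // 0

// "x" (extended): unescaped whitespace in the pattern is ignored,
// which allows a long pattern to be spaced out.
$extended = preg_match('/ first \s line /x', $text); // 1
```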
Anchors tell the engine to look at a specific position in the string: for instance, the beginning of the string, or the end of a line.
The anchor character used for the beginning of the string is ^, and the anchor used for the end of a string is $.
<?php
if (preg_match("/^P/", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
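The $ anchor works the same way at the other end of the string. As a brief sketch (variable names are illustrative), the first two calls test how the sentence ends, and the last two combine both anchors so that the whole string has to fit the pattern.

```php
<?php
$subject = "PHP is the web scripting language of choice.";

// "$" anchors the match to the end of the string; the dot is escaped
// because an unescaped "." matches any character.
$endsWithChoice = preg_match('/choice\.$/', $subject); // 1
$endsWithPhp = preg_match('/PHP$/', $subject);         // 0

// Combining both anchors requires the whole string to fit the pattern.
$onlyDigits = preg_match('/^[0-9]+$/', "2017");        // 1
$notOnlyDigits = preg_match('/^[0-9]+$/', "PHP 7");    // 0
```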
The word boundary \b matches positions where one side is a word character (usually a letter, digit or underscore) and the other side is not a word character (for instance, it may be the beginning of the line or a space character).
<?php
if (preg_match("/\bcat\b/", "black cat")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
The regex \bcat\b would therefore match the word cat on its own, e.g. in black cat, but not words containing cat, such as catastrophe.
<?php
if (preg_match("/\bcat\b/", "catastrophe")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was not found.
The regex \bcat matches cat only where it begins a word, so preg_match() returns 1 for a subject such as catastrophe, but 0 for bobcat.
<?php
if (preg_match("/\bcat/", "catastrophe")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
Similarly, the regex cat\b matches cat only where it ends a word, so preg_match() returns 1 for a subject such as bobcat, but 0 for catastrophe.
<?php
if (preg_match("/cat\b/", "bobcat")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
// Outputs:
// A match was found.
The preg_match_all() function is similar to preg_match(), except that it searches the subject for all matches to the regular expression given in pattern and puts them in matches in the order specified by flags. It returns the number of full pattern matches (which may be zero), or false if an error occurred.
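A short sketch of preg_match_all() with its two ordering flags follows. The subject string and variable names are invented for illustration.

```php
<?php
$subject = "PHP 5.6, PHP 7.0 and PHP 7.1";

// Default flag PREG_PATTERN_ORDER: $matches[0] holds every full match,
// $matches[1] holds every capture from the first group.
$count = preg_match_all('/PHP (\d)\.\d/', $subject, $matches);
// $count is 3
// $matches[0] is ["PHP 5.6", "PHP 7.0", "PHP 7.1"]
// $matches[1] is ["5", "7", "7"]

// PREG_SET_ORDER groups the results per match instead.
preg_match_all('/PHP (\d)\.\d/', $subject, $sets, PREG_SET_ORDER);
// $sets[0] is ["PHP 5.6", "5"]

print_r($matches);
print_r($sets);
```

PREG_SET_ORDER tends to be more convenient when looping over matches one at a time, since each element carries the full match together with its captured groups.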
The preg_replace() function performs a regular expression search and replace. It takes three required parameters (pattern, replacement and subject) and two optional ones (limit and count).
<?php
$copy_date = "Copyright 1999";
// The parentheses here act as the pattern delimiters
$copy_date = preg_replace("([0-9]+)", "2017", $copy_date);
echo $copy_date; // Outputs: Copyright 2017
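The replacement string can also refer back to captured groups, and the optional limit and count parameters control and report how many replacements are made. The following is a minimal sketch with an invented date string and variable names.

```php
<?php
$dates = "2017-01-31 and 2017-02-28";

// $1, $2 and $3 in the replacement refer to the captured groups.
$reordered = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $dates);
echo $reordered, "\n"; // 31/01/2017 and 28/02/2017

// The fourth parameter limits the number of replacements;
// the fifth receives how many were actually made.
$limited = preg_replace('/\d{4}/', 'YYYY', $dates, 1, $count);
echo $limited, "\n"; // YYYY-01-31 and 2017-02-28
echo $count, "\n";   // 1
```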
Note: This article is based on PHP version 7.1.