<?php
include_once $_SERVER['DOCUMENT_ROOT'] . '/include/shared-manual.inc';
$TOC = array();
$PARENTS = array();
include_once dirname(__FILE__) ."/toc/reference.pcre.pattern.syntax.inc";
$setup = array (
'home' =>
array (
0 => 'index.php',
1 => 'PHP Manual',
),
'head' =>
array (
0 => 'UTF-8',
1 => 'en',
),
'this' =>
array (
0 => 'regexp.reference.subpatterns.php',
1 => 'Subpatterns',
),
'up' =>
array (
0 => 'reference.pcre.pattern.syntax.php',
1 => 'Pattern Syntax',
),
'prev' =>
array (
0 => 'regexp.reference.internal-options.php',
1 => 'Internal option setting',
),
'next' =>
array (
0 => 'regexp.reference.repetition.php',
1 => 'Repetition',
),
);
$setup["toc"] = $TOC;
$setup["parents"] = $PARENTS;
manual_setup($setup);
manual_header();
?>
<div id="regexp.reference.subpatterns" class="section">
<h2 class="title">Subpatterns</h2>
<p class="para">
Subpatterns are delimited by parentheses (round brackets),
which can be nested. Marking part of a pattern as a subpattern
does two things:
</p>
<p class="para">
1. It localizes a set of alternatives. For example, the
pattern
<i>cat(aract|erpillar|)</i>
matches one of the words "cat", "cataract", or "caterpillar".
Without the parentheses, it would match "cataract",
"erpillar" or the empty string.
</p>
<p class="para">
2. It sets up the subpattern as a capturing subpattern (as
defined above). When the whole pattern matches, that portion
of the subject string that matched the subpattern is
passed back to the caller via the <em class="emphasis">ovector</em>
argument of
<b>pcre_exec()</b>. Opening parentheses are counted
from left to right (starting from 1) to obtain the numbers of the
capturing subpatterns.
</p>
<p class="para">
For example, if the string "the red king" is matched against
the pattern
<i>the ((red|white) (king|queen))</i>
the captured substrings are "red king", "red", and "king",
and are numbered 1, 2, and 3.
</p>
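<p class="para">
The two roles described above can be checked with <b>preg_match()</b>.
The following is only a minimal sketch: the <i>^</i> and <i>$</i> anchors,
the word list, and the variable names are assumptions added for
illustration and are not part of the patterns discussed in the text.
</p>

```php
<?php
// Role 1: the parentheses localize the alternatives, so the pattern
// accepts "cat", "cataract", or "caterpillar" (anchors added here so
// that the whole word must match).
$anchored = '/^cat(aract|erpillar|)$/';
$words = array('cat', 'cataract', 'caterpillar', 'catnip');
foreach ($words as $word) {
    echo $word, ': ', preg_match($anchored, $word) ? 'match' : 'no match', "\n";
}

// Role 2: capturing subpatterns are numbered by counting opening
// parentheses from left to right, starting from 1.
preg_match('/the ((red|white) (king|queen))/', 'the red king', $matches);
print_r($matches);
// $matches[0] holds the whole match; [1], [2], [3] hold
// "red king", "red", and "king".
?>
```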
<p class="para">
The fact that plain parentheses fulfill two functions is not
always helpful. Often a subpattern is needed for grouping only,
with no need to capture the text it matches. If an
opening parenthesis is followed by "?:", the subpattern does
not do any capturing, and is not counted when computing the
number of any subsequent capturing subpatterns. For example,
if the string "the white queen" is matched against the
pattern
<i>the ((?:red|white) (king|queen))</i>
the captured substrings are "white queen" and "queen", and
are numbered 1 and 2. The maximum number of captured substrings
is 65535, although it may not be possible to compile such large
patterns, depending on the configuration options of libpcre.
</p>
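<p class="para">
As a short illustration of the renumbering, the sketch below runs the
non-capturing pattern from the paragraph above through
<b>preg_match()</b>. The variable name <i>$matches</i> is chosen here
for the example only.
</p>

```php
<?php
// (?:red|white) groups the alternatives but is skipped when numbering,
// so only two capturing subpatterns remain.
preg_match('/the ((?:red|white) (king|queen))/', 'the white queen', $matches);
print_r($matches);
// $matches[1] is "white queen" and $matches[2] is "queen";
// there is no separate entry for "white".
?>
```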
<p class="para">
As a convenient shorthand, if any option settings are
required at the start of a non-capturing subpattern, the
option letters may appear between the "?" and the ":". Thus
the two patterns
</p>
<pre class="literallayout">
(?i:saturday|sunday)
(?:(?i)saturday|sunday)
</pre>
<p class="para">
match exactly the same set of strings. Because alternative
branches are tried from left to right, and options are not
reset until the end of the subpattern is reached, an option
setting in one branch does affect subsequent branches, so
the above patterns match "SUNDAY" as well as "Saturday".
</p>
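<p class="para">
The equivalence of the two forms can be tried out with a small loop.
This is a sketch under assumptions: the <i>^</i> and <i>$</i> anchors
and the list of subject strings are added for illustration so that
each subject must match in full.
</p>

```php
<?php
// Both spellings of the case-insensitive group, anchored for the test.
$patterns = array(
    '/^(?i:saturday|sunday)$/',
    '/^(?:(?i)saturday|sunday)$/',
);
$subjects = array('Saturday', 'SUNDAY', 'sunday', 'Monday');

$results = array();
foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        // preg_match() returns 1 on a match and 0 otherwise.
        $results[$pattern][$subject] = preg_match($pattern, $subject);
        echo $pattern, ' vs ', $subject, ': ', $results[$pattern][$subject], "\n";
    }
}
?>
```

<p class="para">
Both patterns should report the same result for every subject, since
the option set in the first branch also applies to the second.
</p>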
<p class="para">
Since PHP 4.3.3, a subpattern can be named using the syntax
<i>(?P&lt;name&gt;pattern)</i>. The array of matches will then
contain the match indexed by its name alongside the same match
indexed by its number.
</p>
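<p class="para">
A named subpattern might be used as in the sketch below. The date
pattern, the names <i>year</i> and <i>month</i>, and the subject
string are hypothetical and serve only to show how the resulting
array is indexed.
</p>

```php
<?php
// Two named subpatterns; each one is also numbered as usual.
$pattern = '/(?P<year>\d{4})-(?P<month>\d{2})/';
preg_match($pattern, 'Released 2004-07', $matches);

// The same captured text is available under the name and the number.
echo $matches['year'], ' ', $matches[1], "\n";
echo $matches['month'], ' ', $matches[2], "\n";
?>
```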
</div><?php manual_footer(); ?>