downloads | documentation | faq | getting help | mailing lists | licenses | wiki | reporting bugs | php.net sites | links | conferences | my php.net

search for in the

Our source is open

The syntax highlighted source is automatically generated by PHP from the plaintext script. If you're interested in what's behind the several functions we used, you can always take a look at the source of the following files:

Of course, if you want to see the source of this page, we have it available. You can also browse the SVN repository for this website on svn.php.net.

Source of: /manual/en/regexp.reference.recursive.php

<?php
include_once $_SERVER['DOCUMENT_ROOT'] . '/include/shared-manual.inc';
$TOC = array();
$PARENTS = array();
include_once
dirname(__FILE__) ."/toc/reference.pcre.pattern.syntax.inc";
$setup = array (
 
'home' =>
  array (
   
0 => 'index.php',
   
1 => 'PHP Manual',
  ),
 
'head' =>
  array (
   
0 => 'UTF-8',
   
1 => 'en',
  ),
 
'this' =>
  array (
   
0 => 'regexp.reference.recursive.php',
   
1 => 'Recursive patterns',
  ),
 
'up' =>
  array (
   
0 => 'reference.pcre.pattern.syntax.php',
   
1 => 'Pattern Syntax',
  ),
 
'prev' =>
  array (
   
0 => 'regexp.reference.comments.php',
   
1 => 'Comments',
  ),
 
'next' =>
  array (
   
0 => 'regexp.reference.performances.php',
   
1 => 'Performances',
  ),
);
$setup["toc"] = $TOC;
$setup["parents"] = $PARENTS;
manual_setup($setup);

manual_header();
?>
<div id="regexp.reference.recursive" class="section">
     <h2 class="title">Recursive patterns</h2>
     <p class="para">
     Consider the problem of matching a  string  in  parentheses,
     allowing  for  unlimited nested parentheses. Without the use
     of recursion, the best that can be done is to use a  pattern
     that  matches  up  to some fixed depth of nesting. It is not
     possible to handle an arbitrary nesting depth. Perl 5.6  has
     provided   an  experimental  facility  that  allows  regular
     expressions to recurse (among other things).  The  special
     item (?R) is  provided for  the specific  case of recursion.
     This PCRE  pattern  solves the  parentheses  problem (assume
     the <a href="reference.pcre.pattern.modifiers.php" class="link">PCRE_EXTENDED</a>
     option is set so that white space is
     ignored):

       <i>\( ( (?&gt;[^()]+) | (?R) )* \)</i>
    </p>
    <p class="para">
     First it matches an opening parenthesis. Then it matches any
     number  of substrings which can either be a sequence of
     non-parentheses, or a recursive  match  of  the  pattern  itself
     (i.e. a correctly parenthesized substring). Finally there is
     a closing parenthesis.
    </p>
    <p class="para">
     This particular example pattern  contains  nested  unlimited
     repeats, and so the use of a once-only subpattern for matching
     strings of non-parentheses is  important  when  applying
     the  pattern to strings that do not match. For example, when
     it is applied to

       <i>(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()</i>

     it yields &quot;no match&quot; quickly. However, if a  once-only  subpattern
     is  not  used,  the match runs for a very long time
     indeed because there are so many different ways the + and  *
     repeats  can carve up the subject, and all have to be tested
     before failure can be reported.
    </p>
    <p class="para">
     The values set for any capturing subpatterns are those  from
     the outermost level of the recursion at which the subpattern
     value is set. If the pattern above is matched against

       <i>(ab(cd)ef)</i>

     the value for the capturing parentheses is  &quot;ef&quot;,  which  is
     the  last  value  taken  on  at the top level. If additional
     parentheses are added, giving

       <i>\( ( ( (?&gt;[^()]+) | (?R) )* ) \)</i>
     then the string they capture
     is &quot;ab(cd)ef&quot;, the contents of the top level parentheses. If
     there are more than 15 capturing parentheses in  a  pattern,
     PCRE  has  to  obtain  extra  memory  to store data during a
     recursion, which it does by using  pcre_malloc,  freeing  it
     via  pcre_free  afterwards. If no memory can be obtained, it
     saves data for the first 15 capturing parentheses  only,  as
     there is no way to give an out-of-memory error from within a
     recursion.
     </p>
    
     <p class="para">
      Since PHP 4.3.3, <i>(?1)</i>, <i>(?2)</i> and so on
      can be used for recursive subpatterns too. It is also possible to use named
      subpatterns: <i>(?P&gt;name)</i> or
      <i>(?P&amp;name)</i>.
     </p>
     <p class="para">
      If the syntax for a recursive subpattern reference (either by number or
      by name) is used outside the parentheses to which it refers, it operates
      like a subroutine in a programming language. An earlier example
      pointed out that the pattern
      <i>(sens|respons)e and \1ibility</i>
      matches &quot;sense and sensibility&quot; and &quot;response and responsibility&quot;, but
      not &quot;sense and responsibility&quot;. If instead the pattern
      <i>(sens|respons)e and (?1)ibility</i>
      is used, it does match &quot;sense and responsibility&quot; as well as the other
      two strings. Such references must, however, follow the subpattern to
      which they refer.
     </p>
    
     <p class="para">
      The maximum length of a subject string is the largest positive number
      that an integer variable can hold. However, PCRE uses recursion to
      handle subpatterns and indefinite repetition. This means that the
      available stack space may limit the size of a subject string that can be
      processed by certain patterns.
     </p>
    
    </div><?php manual_footer(); ?>
 
show source | credits | sitemap | contact | advertising | mirror sites