diff options
author | Gluzskiy Alexandr <sss123next@gmail.com> | 2010-02-15 05:51:01 +0300 |
---|---|---|
committer | Gluzskiy Alexandr <sss123next@gmail.com> | 2010-02-15 05:51:01 +0300 |
commit | 7fd9fe181150f166a098eaf4e006f878c28cb770 (patch) | |
tree | 093af9d26a08e6bac60112a9c5f2f870ddef0fe8 /Utilities/PCRE/man/html/pcrecompat.3.html | |
parent | 6f32ef233b95d78efb905c97081193a2a454590e (diff) |
sort
Diffstat (limited to 'Utilities/PCRE/man/html/pcrecompat.3.html')
-rw-r--r-- | Utilities/PCRE/man/html/pcrecompat.3.html | 115 |
1 files changed, 115 insertions, 0 deletions
diff --git a/Utilities/PCRE/man/html/pcrecompat.3.html b/Utilities/PCRE/man/html/pcrecompat.3.html new file mode 100644 index 0000000..af67000 --- /dev/null +++ b/Utilities/PCRE/man/html/pcrecompat.3.html @@ -0,0 +1,115 @@ +<!-- manual page source format generated by PolyglotMan v3.2, -->
+<!-- available at http://polyglotman.sourceforge.net/ -->
+
+<html>
+<head>
+<title>PCRE(3) manual page</title>
+</head>
+<body bgcolor='white'>
+<a href='#toc'>Table of Contents</a><p>
+
+<h2><a name='sect0' href='#toc0'>Name</a></h2>
+PCRE - Perl-compatible regular expressions
+<h2><a name='sect1' href='#toc1'>Differences Between Pcre and
+Perl</a></h2>
+ <p>
+This document describes the differences in the ways that PCRE and
+Perl handle regular expressions. The differences described here are with
+respect to Perl 5.8. <p>
+1. PCRE does not have full UTF-8 support. Details of what
+it does have are given in the section on UTF-8 support in the main <b>pcre</b>
+ page. <p>
+2. PCRE does not allow repeat quantifiers on lookahead assertions.
+Perl permits them, but they do not mean what you might think. For example,
+(?!a){3} does not assert that the next three characters are not "a". It
+just asserts that the next character is not "a" three times. <p>
+3. Capturing
+subpatterns that occur inside negative lookahead assertions are counted,
+but their entries in the offsets vector are never set. Perl sets its numerical
+variables from any such patterns that are matched before the assertion
+fails to match something (thereby succeeding), but only if the negative
+lookahead assertion contains just one branch. <p>
+4. Though binary zero characters
+are supported in the subject string, they are not allowed in a pattern
+string because it is passed as a normal C string, terminated by zero. The
+escape sequence \0 can be used in the pattern to represent a binary zero.
+<p>
+5. The following Perl escape sequences are not supported: \l, \u, \L, \U, and
+\N. In fact these are implemented by Perl’s general string-handling and are
+not part of its pattern matching engine. If any of these are encountered
+by PCRE, an error is generated. <p>
+6. The Perl escape sequences \p, \P, and \X
+are supported only if PCRE is built with Unicode character property support.
+The properties that can be tested with \p and \P are limited to the general
+category properties such as Lu and Nd. <p>
+7. PCRE does support the \Q...\E escape
+for quoting substrings. Characters in between are treated as literals. This
+is slightly different from Perl in that $ and @ are also handled as literals
+inside the quotes. In Perl, they cause variable interpolation (but of course
+PCRE does not have variables). Note the following examples: <p>
+ Pattern
+ PCRE matches Perl matches<br>
+ <p>
+ \Qabc$xyz\E abc$xyz abc followed by the<br>
+ contents of $xyz<br>
+ \Qabc\$xyz\E abc\$xyz abc\$xyz<br>
+ \Qabc\E\$\Qxyz\E abc$xyz abc$xyz<br>
+ <p>
+The \Q...\E sequence is recognized both inside and outside character classes.
+<p>
+8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
+constructions. However, there is support for recursive patterns using the
+non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
+allows an external function to be called during pattern matching. See the
+ <b>pcrecallout</b> documentation for details. <p>
+9. There are some differences that
+are concerned with the settings of captured strings when part of a pattern
+is repeated. For example, matching "aba" against the pattern /^(a(b)?)+$/
+in Perl leaves $2 unset, but in PCRE it is set to "b". <p>
+10. PCRE provides
+some extensions to the Perl regular expression facilities: <p>
+(a) Although
+lookbehind assertions must match fixed length strings, each alternative
+branch of a lookbehind assertion can match a different length of string.
+Perl requires them all to have the same length. <p>
+(b) If PCRE_DOLLAR_ENDONLY
+is set and PCRE_MULTILINE is not set, the $ meta-character matches only
+at the very end of the string. <p>
+(c) If PCRE_EXTRA is set, a backslash followed
+by a letter with no special meaning is faulted. <p>
+(d) If PCRE_UNGREEDY is
+set, the greediness of the repetition quantifiers is inverted, that is,
+by default they are not greedy, but if followed by a question mark they
+are. <p>
+(e) PCRE_ANCHORED can be used at matching time to force a pattern to
+be tried only at the first matching position in the subject string. <p>
+(f)
+The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE options
+for <b>pcre_exec()</b> have no Perl equivalents. <p>
+(g) The (?R), (?number), and (?P>name)
+constructs allows for recursive pattern matching (Perl can do this using
+the (?p{code}) construct, which PCRE cannot support.) <p>
+(h) PCRE supports
+named capturing substrings, using the Python syntax. <p>
+(i) PCRE supports the
+possessive quantifier "++" syntax, taken from Sun’s Java package. <p>
+(j) The
+(R) condition, for testing recursion, is a PCRE extension. <p>
+(k) The callout
+facility is PCRE-specific. <p>
+(l) The partial matching facility is PCRE-specific.
+<p>
+(m) Patterns compiled by PCRE can be saved and re-used at a later time,
+even on different hosts that have the other endianness. <p>
+ Last updated: 09
+September 2004 <br>
+Copyright (c) 1997-2004 University of Cambridge. <p>
+
+<hr><p>
+<a name='toc'><b>Table of Contents</b></a><p>
+<ul>
+<li><a name='toc0' href='#sect0'>Name</a></li>
+<li><a name='toc1' href='#sect1'>Differences Between Pcre and Perl</a></li>
+</ul>
+</body>
+</html>
|