diff options
Diffstat (limited to 'Utilities/PCRE/man/html/pcrebuild.3.html')
-rw-r--r-- | Utilities/PCRE/man/html/pcrebuild.3.html | 167 |
1 files changed, 167 insertions, 0 deletions
diff --git a/Utilities/PCRE/man/html/pcrebuild.3.html b/Utilities/PCRE/man/html/pcrebuild.3.html new file mode 100644 index 0000000..950a1f0 --- /dev/null +++ b/Utilities/PCRE/man/html/pcrebuild.3.html @@ -0,0 +1,167 @@ +<!-- manual page source format generated by PolyglotMan v3.2, -->
+<!-- available at http://polyglotman.sourceforge.net/ -->
+
+<html>
+<head>
+<title>PCRE(3) manual page</title>
+</head>
+<body bgcolor='white'>
+<a href='#toc'>Table of Contents</a><p>
+
+<h2><a name='sect0' href='#toc0'>Name</a></h2>
+PCRE - Perl-compatible regular expressions
+<h2><a name='sect1' href='#toc1'>Pcre Build-time Options</a></h2>
+ <p>
+This
+document describes the optional features of PCRE that can be selected when
+the library is compiled. They are all selected, or deselected, by providing
+options to the <b>configure</b> script that is run before the <b>make</b> command. The
+complete list of options for <b>configure</b> (which includes the standard ones
+such as the selection of the installation directory) can be obtained by
+running <p>
+ ./configure --help<br>
+ <p>
+The following sections describe certain options whose names begin with
+--enable or --disable. These settings specify changes to the defaults for the
+<b>configure</b> command. Because of the way that <b>configure</b> works, --enable and --disable
+always come in pairs, so the complementary option always exists as well,
+but as it specifies the default, it is not described.
+<h2><a name='sect2' href='#toc2'>Utf-8 Support</a></h2>
+ <p>
+To build
+PCRE with support for UTF-8 character strings, add <p>
+ --enable-utf8<br>
+ <p>
+to the <b>configure</b> command. Of itself, this does not make PCRE treat strings
+as UTF-8. As well as compiling PCRE with this option, you also have have
+to set the PCRE_UTF8 option when you call the <b>pcre_compile()</b> function.
+
+<h2><a name='sect3' href='#toc3'>Unicode Character Property Support</a></h2>
+ <p>
+UTF-8 support allows PCRE to process
+character values greater than 255 in the strings that it handles. On its
+own, however, it does not provide any facilities for accessing the properties
+of such characters. If you want to be able to use the pattern escapes \P,
+\p, and \X, which refer to Unicode character properties, you must add <p>
+ --enable-unicode-properties<br>
+ <p>
+to the <b>configure</b> command. This implies UTF-8 support, even if you have not
+explicitly requested it. <p>
+Including Unicode property support adds around
+90K of tables to the PCRE library, approximately doubling its size. Only
+the general category properties such as <i>Lu</i> and <i>Nd</i> are supported. Details
+are given in the <b>pcrepattern</b> documentation.
+<h2><a name='sect4' href='#toc4'>Code Value of Newline</a></h2>
+ <p>
+By
+default, PCRE treats character 10 (linefeed) as the newline character. This
+is the normal newline character on Unix-like systems. You can compile PCRE
+to use character 13 (carriage return) instead by adding <p>
+ --enable-newline-is-cr<br>
+ <p>
+to the <b>configure</b> command. For completeness there is also a --enable-newline-is-lf
+option, which explicitly specifies linefeed as the newline character.
+<h2><a name='sect5' href='#toc5'>Building
+Shared and Static Libraries</a></h2>
+ <p>
+The PCRE building process uses <b>libtool</b> to build
+both shared and static Unix libraries by default. You can suppress one of
+these by adding one of <p>
+ --disable-shared<br>
+ --disable-static<br>
+ <p>
+to the <b>configure</b> command, as required.
+<h2><a name='sect6' href='#toc6'>Posix Malloc Usage</a></h2>
+ <p>
+When PCRE is
+called through the POSIX interface (see the <b>pcreposix</b> documentation),
+additional working storage is required for holding the pointers to capturing
+substrings, because PCRE requires three integers per substring, whereas
+the POSIX interface provides only two. If the number of expected substrings
+is small, the wrapper function uses space on the stack, because this is
+faster than using <b>malloc()</b> for each call. The default threshold above which
+the stack is no longer used is 10; it can be changed by adding a setting
+such as <p>
+ --with-posix-malloc-threshold=20<br>
+ <p>
+to the <b>configure</b> command.
+<h2><a name='sect7' href='#toc7'>Limiting Pcre Resource Usage</a></h2>
+ <p>
+Internally, PCRE
+has a function called <b>match()</b>, which it calls repeatedly (possibly recursively)
+when matching a pattern. By controlling the maximum number of times this
+function may be called during a single matching operation, a limit can
+be placed on the resources used by a single call to <b>pcre_exec()</b>. The limit
+can be changed at run time, as described in the <b>pcreapi</b> documentation.
+The default is 10 million, but this can be changed by adding a setting
+such as <p>
+ --with-match-limit=500000<br>
+ <p>
+to the <b>configure</b> command.
+<h2><a name='sect8' href='#toc8'>Handling Very Large Patterns</a></h2>
+ <p>
+Within a compiled
+pattern, offset values are used to point from one part to another (for
+example, from an opening parenthesis to an alternation metacharacter). By
+default, two-byte values are used for these offsets, leading to a maximum
+size for a compiled pattern of around 64K. This is sufficient to handle
+all but the most gigantic patterns. Nevertheless, some people do want to
+process enormous patterns, so it is possible to compile PCRE to use three-byte
+or four-byte offsets by adding a setting such as <p>
+ --with-link-size=3<br>
+ <p>
+to the <b>configure</b> command. The value given must be 2, 3, or 4. Using longer
+offsets slows down the operation of PCRE because it has to load additional
+bytes when handling them. <p>
+If you build PCRE with an increased link size,
+test 2 (and test 5 if you are using UTF-8) will fail. Part of the output
+of these tests is a representation of the compiled pattern, and this changes
+with the link size.
+<h2><a name='sect9' href='#toc9'>Avoiding Excessive Stack Usage</a></h2>
+ <p>
+PCRE implements backtracking
+while matching by making recursive calls to an internal function called
+<b>match()</b>. In environments where the size of the stack is limited, this can
+severely limit PCRE’s operation. (The Unix environment does not usually suffer
+from this problem.) An alternative approach that uses memory from the heap
+to remember data, instead of using recursive function calls, has been implemented
+to work round this problem. If you want to build a version of PCRE that
+works this way, add <p>
+ --disable-stack-for-recursion<br>
+ <p>
+to the <b>configure</b> command. With this configuration, PCRE will use the <b>pcre_stack_malloc</b>
+and <b>pcre_stack_free</b> variables to call memory management functions. Separate
+functions are provided because the usage is very predictable: the block
+sizes requested are always the same, and the blocks are always freed in
+reverse order. A calling program might be able to implement optimized functions
+that perform better than the standard <b>malloc()</b> and <b>free()</b> functions. PCRE
+runs noticeably more slowly when built in this way.
+<h2><a name='sect10' href='#toc10'>Using Ebcdic Code</a></h2>
+ <p>
+PCRE
+assumes by default that it will run in an environment where the character
+code is ASCII (or Unicode, which is a superset of ASCII). PCRE can, however,
+be compiled to run in an EBCDIC environment by adding <p>
+ --enable-ebcdic<br>
+ <p>
+to the <b>configure</b> command. <p>
+ Last updated: 09 September 2004 <br>
+Copyright (c) 1997-2004 University of Cambridge. <p>
+
+<hr><p>
+<a name='toc'><b>Table of Contents</b></a><p>
+<ul>
+<li><a name='toc0' href='#sect0'>Name</a></li>
+<li><a name='toc1' href='#sect1'>Pcre Build-time Options</a></li>
+<li><a name='toc2' href='#sect2'>Utf-8 Support</a></li>
+<li><a name='toc3' href='#sect3'>Unicode Character Property Support</a></li>
+<li><a name='toc4' href='#sect4'>Code Value of Newline</a></li>
+<li><a name='toc5' href='#sect5'>Building Shared and Static Libraries</a></li>
+<li><a name='toc6' href='#sect6'>Posix Malloc Usage</a></li>
+<li><a name='toc7' href='#sect7'>Limiting Pcre Resource Usage</a></li>
+<li><a name='toc8' href='#sect8'>Handling Very Large Patterns</a></li>
+<li><a name='toc9' href='#sect9'>Avoiding Excessive Stack Usage</a></li>
+<li><a name='toc10' href='#sect10'>Using Ebcdic Code</a></li>
+</ul>
+</body>
+</html>
|