author | dartraiden <wowemuh@gmail.com> | 2018-06-01 18:25:57 +0300 |
---|---|---|
committer | dartraiden <wowemuh@gmail.com> | 2018-06-01 18:26:31 +0300 |
commit | 0a55fa14f462169bbd8a8de623804f039854f95f (patch) | |
tree | 19fb2ef7ee1d7b6f3c80b3d83bc010733bc0f58f /libs/Pcre16/docs/doc/pcresyntax.3 | |
parent | 25f2c798a74bf6f72f2d6ba40e37a89c662204ba (diff) |
we only needs license, contributors and version info
Diffstat (limited to 'libs/Pcre16/docs/doc/pcresyntax.3')
-rw-r--r-- | libs/Pcre16/docs/doc/pcresyntax.3 | 540 |
1 files changed, 0 insertions, 540 deletions
diff --git a/libs/Pcre16/docs/doc/pcresyntax.3 b/libs/Pcre16/docs/doc/pcresyntax.3
deleted file mode 100644
index 0850369f7a..0000000000
--- a/libs/Pcre16/docs/doc/pcresyntax.3
+++ /dev/null
@@ -1,540 +0,0 @@
-.TH PCRESYNTAX 3 "08 January 2014" "PCRE 8.35"
-.SH NAME
-PCRE - Perl-compatible regular expressions
-.SH "PCRE REGULAR EXPRESSION SYNTAX SUMMARY"
-.rs
-.sp
-The full syntax and semantics of the regular expressions that are supported by
-PCRE are described in the
-.\" HREF
-\fBpcrepattern\fP
-.\"
-documentation. This document contains a quick-reference summary of the syntax.
-.
-.
-.SH "QUOTING"
-.rs
-.sp
-  \ex         where x is non-alphanumeric is a literal x
-  \eQ...\eE    treat enclosed characters as literal
-.
-.
-.SH "CHARACTERS"
-.rs
-.sp
-  \ea         alarm, that is, the BEL character (hex 07)
-  \ecx        "control-x", where x is any ASCII character
-  \ee         escape (hex 1B)
-  \ef         form feed (hex 0C)
-  \en         newline (hex 0A)
-  \er         carriage return (hex 0D)
-  \et         tab (hex 09)
-  \e0dd       character with octal code 0dd
-  \eddd       character with octal code ddd, or backreference
-  \eo{ddd..}  character with octal code ddd..
-  \exhh       character with hex code hh
-  \ex{hhh..}  character with hex code hhh..
-.sp
-Note that \e0dd is always an octal code, and that \e8 and \e9 are the literal
-characters "8" and "9".
-.
-.
-.SH "CHARACTER TYPES"
-.rs
-.sp
-  .          any character except newline;
-               in dotall mode, any character whatsoever
-  \eC         one data unit, even in UTF mode (best avoided)
-  \ed         a decimal digit
-  \eD         a character that is not a decimal digit
-  \eh         a horizontal white space character
-  \eH         a character that is not a horizontal white space character
-  \eN         a character that is not a newline
-  \ep{\fIxx\fP}     a character with the \fIxx\fP property
-  \eP{\fIxx\fP}     a character without the \fIxx\fP property
-  \eR         a newline sequence
-  \es         a white space character
-  \eS         a character that is not a white space character
-  \ev         a vertical white space character
-  \eV         a character that is not a vertical white space character
-  \ew         a "word" character
-  \eW         a "non-word" character
-  \eX         a Unicode extended grapheme cluster
-.sp
-By default, \ed, \es, and \ew match only ASCII characters, even in UTF-8 mode
-or in the 16- bit and 32-bit libraries. However, if locale-specific matching is
-happening, \es and \ew may also match characters with code points in the range
-128-255. If the PCRE_UCP option is set, the behaviour of these escape sequences
-is changed to use Unicode properties and they match many more characters.
-.
-.
-.SH "GENERAL CATEGORY PROPERTIES FOR \ep and \eP"
-.rs
-.sp
-  C          Other
-  Cc         Control
-  Cf         Format
-  Cn         Unassigned
-  Co         Private use
-  Cs         Surrogate
-.sp
-  L          Letter
-  Ll         Lower case letter
-  Lm         Modifier letter
-  Lo         Other letter
-  Lt         Title case letter
-  Lu         Upper case letter
-  L&         Ll, Lu, or Lt
-.sp
-  M          Mark
-  Mc         Spacing mark
-  Me         Enclosing mark
-  Mn         Non-spacing mark
-.sp
-  N          Number
-  Nd         Decimal number
-  Nl         Letter number
-  No         Other number
-.sp
-  P          Punctuation
-  Pc         Connector punctuation
-  Pd         Dash punctuation
-  Pe         Close punctuation
-  Pf         Final punctuation
-  Pi         Initial punctuation
-  Po         Other punctuation
-  Ps         Open punctuation
-.sp
-  S          Symbol
-  Sc         Currency symbol
-  Sk         Modifier symbol
-  Sm         Mathematical symbol
-  So         Other symbol
-.sp
-  Z          Separator
-  Zl         Line separator
-  Zp         Paragraph separator
-  Zs         Space separator
-.
-.
-.SH "PCRE SPECIAL CATEGORY PROPERTIES FOR \ep and \eP"
-.rs
-.sp
-  Xan        Alphanumeric: union of properties L and N
-  Xps        POSIX space: property Z or tab, NL, VT, FF, CR
-  Xsp        Perl space: property Z or tab, NL, VT, FF, CR
-  Xuc        Univerally-named character: one that can be
-               represented by a Universal Character Name
-  Xwd        Perl word: property Xan or underscore
-.sp
-Perl and POSIX space are now the same. Perl added VT to its space character set
-at release 5.18 and PCRE changed at release 8.34.
-.
-.
-.SH "SCRIPT NAMES FOR \ep AND \eP"
-.rs
-.sp
-Arabic,
-Armenian,
-Avestan,
-Balinese,
-Bamum,
-Bassa_Vah,
-Batak,
-Bengali,
-Bopomofo,
-Brahmi,
-Braille,
-Buginese,
-Buhid,
-Canadian_Aboriginal,
-Carian,
-Caucasian_Albanian,
-Chakma,
-Cham,
-Cherokee,
-Common,
-Coptic,
-Cuneiform,
-Cypriot,
-Cyrillic,
-Deseret,
-Devanagari,
-Duployan,
-Egyptian_Hieroglyphs,
-Elbasan,
-Ethiopic,
-Georgian,
-Glagolitic,
-Gothic,
-Grantha,
-Greek,
-Gujarati,
-Gurmukhi,
-Han,
-Hangul,
-Hanunoo,
-Hebrew,
-Hiragana,
-Imperial_Aramaic,
-Inherited,
-Inscriptional_Pahlavi,
-Inscriptional_Parthian,
-Javanese,
-Kaithi,
-Kannada,
-Katakana,
-Kayah_Li,
-Kharoshthi,
-Khmer,
-Khojki,
-Khudawadi,
-Lao,
-Latin,
-Lepcha,
-Limbu,
-Linear_A,
-Linear_B,
-Lisu,
-Lycian,
-Lydian,
-Mahajani,
-Malayalam,
-Mandaic,
-Manichaean,
-Meetei_Mayek,
-Mende_Kikakui,
-Meroitic_Cursive,
-Meroitic_Hieroglyphs,
-Miao,
-Modi,
-Mongolian,
-Mro,
-Myanmar,
-Nabataean,
-New_Tai_Lue,
-Nko,
-Ogham,
-Ol_Chiki,
-Old_Italic,
-Old_North_Arabian,
-Old_Permic,
-Old_Persian,
-Old_South_Arabian,
-Old_Turkic,
-Oriya,
-Osmanya,
-Pahawh_Hmong,
-Palmyrene,
-Pau_Cin_Hau,
-Phags_Pa,
-Phoenician,
-Psalter_Pahlavi,
-Rejang,
-Runic,
-Samaritan,
-Saurashtra,
-Sharada,
-Shavian,
-Siddham,
-Sinhala,
-Sora_Sompeng,
-Sundanese,
-Syloti_Nagri,
-Syriac,
-Tagalog,
-Tagbanwa,
-Tai_Le,
-Tai_Tham,
-Tai_Viet,
-Takri,
-Tamil,
-Telugu,
-Thaana,
-Thai,
-Tibetan,
-Tifinagh,
-Tirhuta,
-Ugaritic,
-Vai,
-Warang_Citi,
-Yi.
-.
-.
-.SH "CHARACTER CLASSES"
-.rs
-.sp
-  [...]       positive character class
-  [^...]      negative character class
-  [x-y]       range (can be used for hex characters)
-  [[:xxx:]]   positive POSIX named set
-  [[:^xxx:]]  negative POSIX named set
-.sp
-  alnum       alphanumeric
-  alpha       alphabetic
-  ascii       0-127
-  blank       space or tab
-  cntrl       control character
-  digit       decimal digit
-  graph       printing, excluding space
-  lower       lower case letter
-  print       printing, including space
-  punct       printing, excluding alphanumeric
-  space       white space
-  upper       upper case letter
-  word        same as \ew
-  xdigit      hexadecimal digit
-.sp
-In PCRE, POSIX character set names recognize only ASCII characters by default,
-but some of them use Unicode properties if PCRE_UCP is set. You can use
-\eQ...\eE inside a character class.
-.
-.
-.SH "QUANTIFIERS"
-.rs
-.sp
-  ?           0 or 1, greedy
-  ?+          0 or 1, possessive
-  ??          0 or 1, lazy
-  *           0 or more, greedy
-  *+          0 or more, possessive
-  *?          0 or more, lazy
-  +           1 or more, greedy
-  ++          1 or more, possessive
-  +?          1 or more, lazy
-  {n}         exactly n
-  {n,m}       at least n, no more than m, greedy
-  {n,m}+      at least n, no more than m, possessive
-  {n,m}?      at least n, no more than m, lazy
-  {n,}        n or more, greedy
-  {n,}+       n or more, possessive
-  {n,}?       n or more, lazy
-.
-.
-.SH "ANCHORS AND SIMPLE ASSERTIONS"
-.rs
-.sp
-  \eb          word boundary
-  \eB          not a word boundary
-  ^           start of subject
-               also after internal newline in multiline mode
-  \eA          start of subject
-  $           end of subject
-               also before newline at end of subject
-               also before internal newline in multiline mode
-  \eZ          end of subject
-               also before newline at end of subject
-  \ez          end of subject
-  \eG          first matching position in subject
-.
-.
-.SH "MATCH POINT RESET"
-.rs
-.sp
-  \eK          reset start of match
-.sp
-\eK is honoured in positive assertions, but ignored in negative ones.
-.
-.
-.SH "ALTERNATION"
-.rs
-.sp
-  expr|expr|expr...
-.
-.
-.SH "CAPTURING"
-.rs
-.sp
-  (...)           capturing group
-  (?<name>...)    named capturing group (Perl)
-  (?'name'...)    named capturing group (Perl)
-  (?P<name>...)   named capturing group (Python)
-  (?:...)         non-capturing group
-  (?|...)         non-capturing group; reset group numbers for
-                   capturing groups in each alternative
-.
-.
-.SH "ATOMIC GROUPS"
-.rs
-.sp
-  (?>...)         atomic, non-capturing group
-.
-.
-.
-.
-.SH "COMMENT"
-.rs
-.sp
-  (?#....)        comment (not nestable)
-.
-.
-.SH "OPTION SETTING"
-.rs
-.sp
-  (?i)            caseless
-  (?J)            allow duplicate names
-  (?m)            multiline
-  (?s)            single line (dotall)
-  (?U)            default ungreedy (lazy)
-  (?x)            extended (ignore white space)
-  (?-...)         unset option(s)
-.sp
-The following are recognized only at the very start of a pattern or after one
-of the newline or \eR options with similar syntax. More than one of them may
-appear.
-.sp
-  (*LIMIT_MATCH=d) set the match limit to d (decimal number)
-  (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
-  (*NO_AUTO_POSSESS) no auto-possessification (PCRE_NO_AUTO_POSSESS)
-  (*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE)
-  (*UTF8)         set UTF-8 mode: 8-bit library (PCRE_UTF8)
-  (*UTF16)        set UTF-16 mode: 16-bit library (PCRE_UTF16)
-  (*UTF32)        set UTF-32 mode: 32-bit library (PCRE_UTF32)
-  (*UTF)          set appropriate UTF mode for the library in use
-  (*UCP)          set PCRE_UCP (use Unicode properties for \ed etc)
-.sp
-Note that LIMIT_MATCH and LIMIT_RECURSION can only reduce the value of the
-limits set by the caller of pcre_exec(), not increase them.
-.
-.
-.SH "NEWLINE CONVENTION"
-.rs
-.sp
-These are recognized only at the very start of the pattern or after option
-settings with a similar syntax.
-.sp
-  (*CR)           carriage return only
-  (*LF)           linefeed only
-  (*CRLF)         carriage return followed by linefeed
-  (*ANYCRLF)      all three of the above
-  (*ANY)          any Unicode newline sequence
-.
-.
-.SH "WHAT \eR MATCHES"
-.rs
-.sp
-These are recognized only at the very start of the pattern or after option
-setting with a similar syntax.
-.sp
-  (*BSR_ANYCRLF)  CR, LF, or CRLF
-  (*BSR_UNICODE)  any Unicode newline sequence
-.
-.
-.SH "LOOKAHEAD AND LOOKBEHIND ASSERTIONS"
-.rs
-.sp
-  (?=...)         positive look ahead
-  (?!...)         negative look ahead
-  (?<=...)        positive look behind
-  (?<!...)        negative look behind
-.sp
-Each top-level branch of a look behind must be of a fixed length.
-.
-.
-.SH "BACKREFERENCES"
-.rs
-.sp
-  \en              reference by number (can be ambiguous)
-  \egn             reference by number
-  \eg{n}           reference by number
-  \eg{-n}          relative reference by number
-  \ek<name>        reference by name (Perl)
-  \ek'name'        reference by name (Perl)
-  \eg{name}        reference by name (Perl)
-  \ek{name}        reference by name (.NET)
-  (?P=name)       reference by name (Python)
-.
-.
-.SH "SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)"
-.rs
-.sp
-  (?R)            recurse whole pattern
-  (?n)            call subpattern by absolute number
-  (?+n)           call subpattern by relative number
-  (?-n)           call subpattern by relative number
-  (?&name)        call subpattern by name (Perl)
-  (?P>name)       call subpattern by name (Python)
-  \eg<name>        call subpattern by name (Oniguruma)
-  \eg'name'        call subpattern by name (Oniguruma)
-  \eg<n>           call subpattern by absolute number (Oniguruma)
-  \eg'n'           call subpattern by absolute number (Oniguruma)
-  \eg<+n>          call subpattern by relative number (PCRE extension)
-  \eg'+n'          call subpattern by relative number (PCRE extension)
-  \eg<-n>          call subpattern by relative number (PCRE extension)
-  \eg'-n'          call subpattern by relative number (PCRE extension)
-.
-.
-.SH "CONDITIONAL PATTERNS"
-.rs
-.sp
-  (?(condition)yes-pattern)
-  (?(condition)yes-pattern|no-pattern)
-.sp
-  (?(n)...        absolute reference condition
-  (?(+n)...       relative reference condition
-  (?(-n)...       relative reference condition
-  (?(<name>)...   named reference condition (Perl)
-  (?('name')...   named reference condition (Perl)
-  (?(name)...     named reference condition (PCRE)
-  (?(R)...        overall recursion condition
-  (?(Rn)...       specific group recursion condition
-  (?(R&name)...   specific recursion condition
-  (?(DEFINE)...   define subpattern for reference
-  (?(assert)...   assertion condition
-.
-.
-.SH "BACKTRACKING CONTROL"
-.rs
-.sp
-The following act immediately they are reached:
-.sp
-  (*ACCEPT)       force successful match
-  (*FAIL)         force backtrack; synonym (*F)
-  (*MARK:NAME)    set name to be passed back; synonym (*:NAME)
-.sp
-The following act only when a subsequent match failure causes a backtrack to
-reach them. They all force a match failure, but they differ in what happens
-afterwards. Those that advance the start-of-match point do so only if the
-pattern is not anchored.
-.sp
-  (*COMMIT)       overall failure, no advance of starting point
-  (*PRUNE)        advance to next starting character
-  (*PRUNE:NAME)   equivalent to (*MARK:NAME)(*PRUNE)
-  (*SKIP)         advance to current matching position
-  (*SKIP:NAME)    advance to position corresponding to an earlier
-                  (*MARK:NAME); if not found, the (*SKIP) is ignored
-  (*THEN)         local failure, backtrack to next alternation
-  (*THEN:NAME)    equivalent to (*MARK:NAME)(*THEN)
-.
-.
-.SH "CALLOUTS"
-.rs
-.sp
-  (?C)      callout
-  (?Cn)     callout with data n
-.
-.
-.SH "SEE ALSO"
-.rs
-.sp
-\fBpcrepattern\fP(3), \fBpcreapi\fP(3), \fBpcrecallout\fP(3),
-\fBpcrematching\fP(3), \fBpcre\fP(3).
-.
-.
-.SH AUTHOR
-.rs
-.sp
-.nf
-Philip Hazel
-University Computing Service
-Cambridge CB2 3QH, England.
-.fi
-.
-.
-.SH REVISION
-.rs
-.sp
-.nf
-Last updated: 08 January 2014
-Copyright (c) 1997-2014 University of Cambridge.
-.fi