Syntax
The syntax for the REGEXP_SUBSTR function in Oracle is:
REGEXP_SUBSTR( string, pattern [, start_position [, nth_appearance [, match_parameter [, sub_expression ] ] ] ] )
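As a minimal sketch of the simplest form of the call, the query below passes only string and pattern and lets every optional argument take its default. The sample string and the column alias are made up for illustration.

```sql
-- Only the two required arguments are supplied; the search starts at
-- position 1 and returns the first match.
SELECT REGEXP_SUBSTR('Order 1234 shipped', '[0-9]+') AS first_number
FROM dual;
-- first_number: 1234
```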
Parameters or Arguments
- string
- The string to search. It can be CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
- pattern
- The regular expression matching information. It can be a combination of the following:
Value    Description
^        Matches the beginning of a string. If used with a match_parameter of 'm', it matches the start of a line anywhere within string.
$        Matches the end of a string. If used with a match_parameter of 'm', it matches the end of a line anywhere within string.
*        Matches zero or more occurrences.
+        Matches one or more occurrences.
?        Matches zero or one occurrence.
.        Matches any character except NULL.
|        Used like an "OR" to specify more than one alternative.
[ ]      Used to specify a matching list where you are trying to match any one of the characters in the list.
[^ ]     Used to specify a nonmatching list where you are trying to match any character except for the ones in the list.
( )      Used to group expressions as a subexpression.
{m}      Matches m times.
{m,}     Matches at least m times.
{m,n}    Matches at least m times, but no more than n times.
\n       n is a number between 1 and 9. Matches the nth subexpression found within ( ) before encountering \n.
[..]     Matches one collation element that can be more than one character.
[::]     Matches character classes.
[==]     Matches equivalence classes.
\d       Matches a digit character.
\D       Matches a nondigit character.
\w       Matches a word character.
\W       Matches a nonword character.
\s       Matches a whitespace character.
\S       Matches a non-whitespace character.
\A       Matches the beginning of a string.
\Z       Matches the end of a string, or just before a newline character at the end of a string.
\z       Matches the end of a string.
*?       Matches the preceding pattern zero or more times (non-greedy: as few as possible).
+?       Matches the preceding pattern one or more times (non-greedy).
??       Matches the preceding pattern zero or one time (non-greedy).
{n}?     Matches the preceding pattern n times (non-greedy).
{n,}?    Matches the preceding pattern at least n times (non-greedy).
{n,m}?   Matches the preceding pattern at least n times, but not more than m times (non-greedy).
- start_position
- Optional. It is the position in string where the search will start. If omitted, it defaults to 1 which is the first position in the string.
- nth_appearance
- Optional. It is the nth appearance of pattern in string. If omitted, it defaults to 1 which is the first appearance of pattern in string.
- match_parameter
- Optional. It allows you to modify the matching behavior for the REGEXP_SUBSTR function. It can be a combination of the following:
Value    Description
'c'      Perform case-sensitive matching.
'i'      Perform case-insensitive matching.
'n'      Allows the period character (.) to match the newline character. By default, the period does not match the newline character.
'm'      string is treated as multiple lines, where ^ matches the start of a line and $ matches the end of a line, wherever those lines occur in string. By default, string is treated as a single line.
'x'      Whitespace characters in pattern are ignored. By default, whitespace characters are matched like any other character.
- sub_expression
- Optional. This is used when pattern has subexpressions and you wish to indicate which subexpression in pattern is the target. It is an integer value from 0 to 9 indicating the subexpression to match on in pattern. A value of 0, which is the default, returns the entire match.
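The optional arguments described above are easiest to see side by side. The queries below are an illustrative sketch: the sample strings and column aliases are invented, and each comment shows the value the call is expected to return.

```sql
-- start_position = 3: the search begins after the leading '10'.
SELECT REGEXP_SUBSTR('10 apples, 20 pears, 30 plums', '[0-9]+', 3) AS from_pos_3
FROM dual;
-- from_pos_3: 20

-- nth_appearance = 3: return the third run of digits.
SELECT REGEXP_SUBSTR('10 apples, 20 pears, 30 plums', '[0-9]+', 1, 3) AS third_number
FROM dual;
-- third_number: 30

-- match_parameter = 'i': case-insensitive match; the text is returned
-- as it appears in string, not as it is written in pattern.
SELECT REGEXP_SUBSTR('Hello World', 'world', 1, 1, 'i') AS ci_match
FROM dual;
-- ci_match: World

-- Last argument = 2: return only the second parenthesized group.
SELECT REGEXP_SUBSTR('user@example.com', '([^@]+)@(.+)', 1, 1, 'i', 2) AS domain_part
FROM dual;
-- domain_part: example.com

-- Greedy versus non-greedy quantifier on the same input.
SELECT REGEXP_SUBSTR('<b>one</b><b>two</b>', '<b>.*</b>')  AS greedy,
       REGEXP_SUBSTR('<b>one</b><b>two</b>', '<b>.*?</b>') AS non_greedy
FROM dual;
-- greedy:     <b>one</b><b>two</b>
-- non_greedy: <b>one</b>
```

Because the arguments are positional, start_position and nth_appearance have to be supplied (here as 1, 1) whenever a match_parameter or a subexpression number is needed.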
Returns
The REGEXP_SUBSTR function returns a string value.
If the REGEXP_SUBSTR function does not find any occurrence of pattern, it will return NULL.
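A short sketch of the no-match case follows. The sample string and the 'n/a' fallback are assumptions chosen for illustration; NVL is one common way to substitute a default when NULL comes back.

```sql
-- No digits in the string, so the function returns NULL.
SELECT REGEXP_SUBSTR('no digits here', '[0-9]+') AS result
FROM dual;
-- result: NULL

-- Wrapping the call in NVL replaces the NULL with a fallback value.
SELECT NVL(REGEXP_SUBSTR('no digits here', '[0-9]+'), 'n/a') AS result
FROM dual;
-- result: n/a
```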
Note
- If there are conflicting values provided for match_parameter, the REGEXP_SUBSTR function will use the last value.
- If you omit the match_parameter argument, the REGEXP_SUBSTR function uses the NLS_SORT parameter to determine whether the search is case-sensitive, treats string as a single line, and lets the period character match any character except the newline character.
- See also the SUBSTR function.
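The first two notes can be checked with a few small queries. This is a sketch with invented sample strings; CHR(10) is used to embed a newline character.

```sql
-- Conflicting flags: the last one wins.
-- 'ic' ends in 'c', so the match is case-sensitive and fails.
SELECT REGEXP_SUBSTR('Hello', 'hello', 1, 1, 'ic') AS last_is_c,
-- 'ci' ends in 'i', so the match is case-insensitive and succeeds.
       REGEXP_SUBSTR('Hello', 'hello', 1, 1, 'ci') AS last_is_i
FROM dual;
-- last_is_c: NULL
-- last_is_i: Hello

-- With match_parameter omitted, the period does not match a newline,
-- so the pattern cannot span the two lines.
SELECT REGEXP_SUBSTR('line1' || CHR(10) || 'line2', 'line1.line2') AS default_dot
FROM dual;
-- default_dot: NULL

-- With 'n', the period matches the newline and the whole string is returned.
SELECT REGEXP_SUBSTR('line1' || CHR(10) || 'line2', 'line1.line2', 1, 1, 'n') AS dot_with_n
FROM dual;
-- dot_with_n: line1, a newline, then line2
```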