Oracle / PLSQL: REGEXP_INSTR Function

This article is written about how to use the Oracle/PLSQL REGEXP_INSTR function with syntax and examples.

Description

The Oracle/PLSQL REGEXP_INSTR characteristic is an extension of the INSTR function. It returns the location of a normal expression sample in a string. This function, added in Oracle 10g, will allow you to locate a substring in a string using everyday expression sample matching.

Syntax

The syntax for the REGEXP_INSTR feature in Oracle is:

REGEXP_INSTR( string, pattern [, start_position [, nth_appearance [, return_option [, match_parameter [, sub_expression ] ] ] ] ] )

Parameters or Arguments

string The string to search. string can be CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. pattern The regular expression matching information. It can be a combination of the following: Value Description ^ Matches the commencing of a string. If used with a match_parameter of ‘m’, it fits the begin of a line anywhere within expression. $ Matches the give up of a string. If used with a match_parameter of ‘m’, it suits the quit of a line anywhere inside expression. * Matches zero or extra occurrences. + Matches one or extra occurrences. ? Matches zero or one occurrence. . Matches any personality without NULL. | Used like an “OR” to specify more than one alternative. [ ] Used to specify a matching list where you are making an attempt to fit any one of the characters in the list. [^ ] Used to specify a nonmatching listing where you are making an attempt to suit any character except for the ones in the list. ( ) Used to group expressions as a subexpression. {m} Matches m times. {m,} Matches at least m times. {m,n} Matches at least m times, but no greater than n times. \n n is a number between 1 and 9. Matches the nth subexpression discovered within ( ) before encountering \n. [..] Matches one collation component that can be extra than one character. [::] Matches persona classes. [==] Matches equivalence classes. \d Matches a digit character. \D Matches a nondigit character. \w Matches a phrase character. \W Matches a nonword character. \s Matches a whitespace character. \S suits a non-whitespace character. \A Matches the establishing of a string or matches at the give up of a string earlier than a newline character. \Z Matches at the end of a string. *? Matches the previous pattern zero or extra occurrences. +? Matches the previous pattern one or more occurrences. ?? Matches the previous sample zero or one occurrence. {n}? Matches the previous pattern n times. {n,}? Matches the previous sample at least n times. {n,m}? Matches the previous sample at least n times, but not extra than m times. start_position Optional. It is the function in string the place the search will start. If omitted, it defaults to 1 which is the first role in the string. nth_appearance Optional. It is the nth look of pattern in string. If omitted, it defaults to 1 which is the first appearance of pattern in string. return_option Optional. If a return_option of 0 is provided, the function of the first persona of the prevalence of sample is returned. If a return_option of 1 is provided, the position of the persona after the incidence of pattern is returned. If omitted, it defaults to 0. match_parameter Optional. It permits you to alter the matching behavior for the REGEXP_INSTR function. It can be a mixture of the following: Value Description ‘c’ Perform case-sensitive matching. ‘i’ Perform case-insensitive matching. ‘n’ Allows the length personality (.) to in shape the newline character. By default, the duration is a wildcard. ‘m’ expression is assumed to have a couple of lines, where ^ is the start of a line and $ is the give up of a line, regardless of the role of these characters in expression. By default, expression is assumed to be a single line. ‘x’ Whitespace characters are ignored. By default, whitespace characters are matched like any different character. subexpression Optional. This is used when sample has subexpressions and you desire to point out which subexpression in sample is the target. It is an integervalue from 0 to 9 indicating the subexpression to suit on in pattern.

Returns

The REGEXP_INSTR characteristic returns a numeric value. If the REGEXP_INSTR feature does now not locate any occurrence of pattern, it will return 0

Note

If there are conflicting values supplied for match_parameter, the REGEXP_INSTR feature will use the ultimate value. If you miss the match_behavior parameter, the REGEXP_INSTR characteristic will use the NLS_SORT parameter to determine if it use a case-sensitive search, it will assume that string is a single line, and expect the duration personality to healthy any character (not the newline character). See additionally the INSTR function.

Applies To

The REGEXP_INSTR characteristic can be used in the following variations of Oracle/PLSQL:

Oracle 12c, Oracle 11g, Oracle 10g

Example – Match on Single Character

Let’s start via searching at the easiest case. Let’s locate the role of the first ‘t’ character in a string.

For example:

SELECT REGEXP_INSTR ('TechOnTheNet is a great resource', 't')
FROM dual;

Result: 12

This instance will return 12 due to the fact it is performing a case-sensitive search of ‘t’. Therefore, it skips the ‘T’ characters and finds the first ‘t’ in the twelfth position.

If we desired to encompass both ‘t’ and ‘T’ in our results and operate a case-insensitive search, we may want to regulate our query as follows:

SELECT REGEXP_INSTR ('TechOnTheNet is a great resource', 't', 1, 1, 0, 'i')
FROM dual;

Result: 1

Now because we have supply a start_position of 1, an nth_appearance of 1, a return_option of 0, and a match_parameter of ‘i’, the question will return 1 as the result. This time, the function will search for both ‘t’ and ‘T’ values and return the first occurrence.

If we wanted to locate the first prevalence of the personality ‘t’ in a column, we should attempt some thing like this (case-insensitive search):

SELECT REGEXP_INSTR (last_name, 't', 1, 1, 0, 'i') AS First_Occurrence
FROM contacts;

This would return the first occurrence of ‘t’ or ‘T’ values in the last_name discipline from the contacts table.

Example – Match on Multiple Characters

Let’s seem to be next at how we would use the REGEXP_INSTR feature to in shape on a multi-character pattern.

For example:

SELECT REGEXP_INSTR ('The example shows how to use the REGEXP_INSTR function', 'ow', 1, 1, 0, 'i')
FROM dual;

Result: 15

This instance will return the first occurrence of ‘ow’ in the string. It will suit on the ‘ow’ in the phrase ‘shows’.

We may want to alternate the starting function of the search so that we operate the search starting from the middle of the string.

For example:

SELECT REGEXP_INSTR ('The example shows how to use the REGEXP_INSTR function', 'ow', 16, 1, 0, 'i')
FROM dual;

Result: 20

This instance will begin the search for the pattern of ‘ow’ at role 16 in the string. In this case, it will ignore over the first 15 characters in the string earlier than looking out for the pattern.

Now, let’s seem to be how we would use the REGEXP_INSTR feature with a desk column and search for a couple of characters.

For example:

SELECT REGEXP_INSTR (other_comments, 'the', 1, 1, 0, 'i')
FROM contacts;

In this example, we are going to search for the sample in the other_comments area in the contacts table.

Example – Match on more than one alternative

The next instance that we will seem at entails using the | pattern. The | sample is used like an “OR” to specify extra than one alternative.

For example:

SELECT REGEXP_INSTR ('Anderson', 'a|e|i|o|u')
FROM dual;

Result: 4

This example will return 4 due to the fact it is searching for the first vowel (a, e, i, o, or u) in the string. Since we did no longer specify a match_parameter value, the REGEXP_INSTR feature will perform a case-sensitive search which means that the ‘A’ in ‘Anderson’ will no longer be matched.

We should regulate our question as follows to operate a case-insensitive search as follows:

SELECT REGEXP_INSTR ('Anderson', 'a|e|i|o|u', 1, 1, 0, 'i')
FROM dual;

Result: 1

Now due to the fact we have provide a match_parameter of ‘i’, the question will return 1 as the result. This time, the ‘A’ in ‘Anderson’ will be discovered as a match.

Now, let’s shortly exhibit how you would use this feature with a column.

So let’s say we have a contact desk with the following data:

contact_id last_name 1000 Anderson 2000 Smith 3000 Johnson

Now, let’s run the following query:

SELECT contact_id, last_name, REGEXP_INSTR (last_name, 'a|e|i|o|u', 1, 1, 0, 'i') AS first_occurrence
FROM contacts;

These are the effects that would be back by the query:

contact_id last_name first_occurrence 1000 Anderson 1 2000 Smith 3 3000 Johnson 2

Example – Match on nth_occurrence

The next instance that we will seem to be at involves the nth_occurrence parameter. The nth_occurrence parameter permits you to pick out which incidence of the sample you wish to return the position of.

First Occurrence

Let’s seem to be at how to discover the first occurrence of a sample in a string.

For example:

SELECT REGEXP_INSTR ('TechOnTheNet', 'a|e|i|o|u', 1, 1, 0, 'i')
FROM dual;

Result: 2

This example will return two because it is looking out for the first occurrence of a vowel (a, e, i, o, or u) in the string.

Second Occurrence

Next, we will search for the second occurrence of a pattern in a string.

For example:

SELECT REGEXP_INSTR ('TechOnTheNet', 'a|e|i|o|u', 1, 2, 0, 'i')
FROM dual;

Result: 5

This example will return 5 because it is looking out for the 2d prevalence of a vowel (a, e, i, o, or u) in the string.

Third Occurrence

For example:

SELECT REGEXP_INSTR ('TechOnTheNet', 'a|e|i|o|u', 1, 3, 0, 'i')
FROM dual;

Result: 9

This example will return 9 because it is looking for the third occurrence of a vowel (a, e, i, o, or u) in the string.

Example – return_option parameter

Finally, let’s seem to be at how the return_option parameter influences our results.

For example:

SELECT REGEXP_INSTR ('TechOnTheNet', 'The', 1, 1, 0, 'i')
FROM dual;

Result: 7

In this fundamental example, we are looking out for a pattern in a string and the search is case-insensitive. We have distinct the return_option parameter as 0 which ability the position of the first character of the pattern will be returned.

Now, let’s change the return_option parameter to 1 and see what happens.

For example:

SELECT REGEXP_INSTR ('TechOnTheNet', 'The', 1, 1, 1, 'i')
FROM dual;

Result: 10

A return_option parameter of 1 tells the REGEXP_INSTR characteristic to return the position of the character following the matched pattern. In this example, the feature will return 10.