Java Regular Expressions (Regex) Tutorial

Regular expressions

A regular expression is a sequence of symbols and characters that describes a string or pattern to be searched for within a longer piece of text. The common abbreviation for regular expression is regex. In programming, regular expressions are mainly used to define constraints on strings, such as password rules and email validation.

The java.util.regex package consists primarily of one interface and three classes:

  1. MatchResult interface
  2. Pattern class
  3. Matcher class
  4. PatternSyntaxException class

Pattern class

The Pattern class defines the pattern for a regex. A Pattern object represents a compiled version of a regular expression.

Pattern class methods

Method    Description
static Pattern compile(String regex)    Compiles the given regex and returns a Pattern instance.
Matcher matcher(CharSequence input)    Creates a matcher that will match the given input against this pattern.
static boolean matches(String regex, CharSequence input)    Combines the compile and matcher methods: compiles the regular expression and attempts to match the entire input against it.
String[] split(CharSequence input)    Splits the given input around matches of this pattern.
String pattern()    Returns the regular expression string from which this pattern was compiled.
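
The following is a minimal sketch of the Pattern methods listed above. The class name PatternMethodsDemo, the helper splitOnCommas, and the sample strings are made up for illustration; only the Pattern calls come from java.util.regex. Note that each regex backslash is doubled inside a Java string literal.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; only the Pattern methods are part of the JDK.
public class PatternMethodsDemo {
    // Compile once and reuse: a comma with optional whitespace on either side.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Splits the input around every match of COMMA.
    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        // pattern() returns the source string the Pattern was compiled from.
        System.out.println(COMMA.pattern());

        // split() returns the pieces of the input between matches.
        System.out.println(Arrays.toString(splitOnCommas("red , green,blue")));

        // Pattern.matches() compiles and matches the whole input in one call.
        System.out.println(Pattern.matches("\\d+", "2024"));   // all digits
        System.out.println(Pattern.matches("\\d+", "2024a"));  // trailing letter
    }
}
```

Compiling the pattern once into a constant is usually preferable when the same regex is applied many times, since Pattern.matches recompiles the expression on every call.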

Matcher class

A Matcher object is the regex engine: it interprets a pattern and performs match operations on a character sequence.

Matcher class methods

Method    Description
boolean matches()    Tests whether the entire input sequence matches the pattern.
boolean find()    Finds the next subsequence of the input that matches the pattern.
boolean find(int start)    Resets the matcher and finds the next subsequence that matches the pattern, starting at the given index.
String group()    Returns the subsequence matched by the previous match.
int start()    Returns the start index of the previous match.
int end()    Returns the index just after the last character of the previous match.
int groupCount()    Returns the number of capturing groups in the matcher's pattern.
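
As an illustration of how these Matcher methods fit together, here is a small sketch that scans a string for every match. The class name MatcherMethodsDemo, the helper describeMatches, and the "#digits" ticket format are assumptions invented for this example. It also uses group(int), an overload of group() that returns the text captured by a numbered group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the "#123" ticket format is made up for illustration.
public class MatcherMethodsDemo {
    // One capturing group around the digits that follow '#'.
    private static final Pattern TICKET = Pattern.compile("#(\\d+)");

    // Records each match as "text@start-end:digits".
    static List<String> describeMatches(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = TICKET.matcher(input);
        while (m.find()) { // advances to the next match on each call
            found.add(m.group() + "@" + m.start() + "-" + m.end() + ":" + m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "See #12 and #345";
        System.out.println(describeMatches(text)); // [#12@4-7:12, #345@12-16:345]

        Matcher m = TICKET.matcher(text);
        System.out.println(m.groupCount()); // 1: counts groups in the pattern, not matches
        System.out.println(m.matches());    // false: the whole input is not one ticket
        System.out.println(m.find(8));      // true: resets, then searches from index 8
        System.out.println(m.group());      // #345
    }
}
```

Note that end() is exclusive, so input.substring(m.start(), m.end()) yields the same text as m.group().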

PatternSyntaxException class

PatternSyntaxException is an unchecked exception that indicates a syntax error in a regular expression pattern.
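
Because the exception is unchecked, the compiler does not force callers to handle it. A sketch of one way to catch it, for example when a regex comes from user input, is shown below. The class name SyntaxErrorDemo and the helper validate are hypothetical; getDescription() and getIndex() are methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class showing one way to report an invalid regex.
public class SyntaxErrorDemo {
    // Returns null when the regex compiles, otherwise a short error description.
    static String validate(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            // getDescription() explains the error; getIndex() is the approximate
            // position in the pattern, or -1 if it is not known.
            return e.getDescription() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("[a-z]+"));    // valid pattern, prints null
        System.out.println(validate("(unclosed")); // missing ')', prints an error description
    }
}
```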

Regular Expression Syntax

Subexpression    Matches
^                Matches the beginning of the input; in MULTILINE mode, the beginning of each line.
$                Matches the end of the input (or just before a final line terminator); in MULTILINE mode, the end of each line.
.                Matches any single character except a line terminator. The DOTALL flag, (?s), allows it to match line terminators as well.
[...]            Matches any single character in brackets.
[^...]           Matches any single character not in brackets.
\A               Matches the beginning of the entire input.
\z               Matches the end of the entire input.
\Z               Matches the end of the entire input, except for a final line terminator if one exists.
re*              Matches 0 or more occurrences of the preceding expression.
re+              Matches 1 or more occurrences of the preceding expression.
re?              Matches 0 or 1 occurrence of the preceding expression.
re{n}            Matches exactly n occurrences of the preceding expression.
re{n,}           Matches n or more occurrences of the preceding expression.
re{n,m}          Matches at least n and at most m occurrences of the preceding expression.
a|b              Matches either a or b.
(re)             Groups regular expressions and remembers the matched text.
(?:re)           Groups regular expressions without remembering the matched text.
(?>re)           Matches the independent pattern without backtracking.
\w               Matches a word character. Equivalent to [a-zA-Z_0-9] by default.
\W               Matches a non-word character.
\s               Matches a whitespace character. Equivalent to [ \t\n\x0B\f\r].
\S               Matches a non-whitespace character.
\d               Matches a digit. Equivalent to [0-9] by default.
\D               Matches a non-digit.
\G               Matches the point where the last match finished.
\1, \2, ...      Back-reference to the capture group with that number.
\b               Matches a word boundary. Some other regex flavors treat \b inside brackets as a backspace (0x08); Java's Pattern documentation does not list that form.
\B               Matches a non-word boundary.
\n, \r, \t, etc. Matches newlines, carriage returns, tabs, etc.
\Q               Escapes (quotes) all characters up to \E.
\E               Ends quoting begun with \Q.
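
A few of these constructs can be tried out with the sketch below. The class name SyntaxDemo, the helper methods, and the sample strings are invented for illustration. Keep in mind that a regex backslash must be written as two backslashes in a Java string literal, so \d becomes "\\d".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class exercising a few constructs from the syntax table.
public class SyntaxDemo {
    // \b(\w+)\s+\1\b : a word, whitespace, then the same word again via back-reference \1.
    static boolean hasRepeatedWord(String text) {
        return Pattern.compile("\\b(\\w+)\\s+\\1\\b").matcher(text).find();
    }

    // \Q...\E quotes the dot, so only a literal "1.5" is found.
    static boolean containsLiteral(String text) {
        return Pattern.compile("\\Q1.5\\E").matcher(text).find();
    }

    // With MULTILINE, ^ matches at the start of every line, not only the input.
    static int countLinesStartingWithDigit(String text) {
        Matcher m = Pattern.compile("^\\d", Pattern.MULTILINE).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedWord("it is is fine"));  // true
        System.out.println(hasRepeatedWord("this is fine"));   // false
        System.out.println(containsLiteral("v1.5"));           // true
        System.out.println(containsLiteral("v105"));           // false
        System.out.println(countLinesStartingWithDigit("1 a\nb\n2 c")); // 2

        // By default "." does not match a newline; DOTALL changes that.
        System.out.println(Pattern.matches("a.b", "a\nb"));                              // false
        System.out.println(Pattern.compile("a.b", Pattern.DOTALL).matcher("a\nb").matches()); // true
    }
}
```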

Java Regular Expressions Example

A regular expression match can be written in three ways. The example below shows each of them.

package com.w3schools;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTest {
	public static void main(String[] args) {
		// 1st way: compile the pattern, create a matcher, then match.
		// "." matches any single character, so ".s" matches "js".
		Pattern p = Pattern.compile(".s");
		Matcher m = p.matcher("js");
		boolean result1 = m.matches();
		System.out.println(result1);

		// 2nd way: chain the same three calls in one expression.
		boolean result2 = Pattern.compile(".s").matcher("js").matches();
		System.out.println(result2);

		// 3rd way: Pattern.matches compiles and matches in a single call.
		boolean result3 = Pattern.matches(".s", "js");
		System.out.println(result3);
	}
}

Output

true
true
true
