Regular Expressions

Updated on Sep 2, 2025

Regular expressions define string patterns that can be used to search, manipulate and edit text. They are commonly referred to as regex (short for regular expression).

In the example below, the regular expression .*from.* is used to check whether the string “from” occurs in the text.

import java.util.regex.*;  
class MyRegexExample{   
   public static void main(String args[]){   
    String content = "I am Ashish " + 
      "from Bangalore"; 
    String pattern = ".*from.*"; 
    boolean isMatch = Pattern.matches(pattern, content); 
    System.out.println("The text contains 'from'? " + isMatch); 
   } 
}   

Output:

The text contains 'from'? true

We'll learn how to define patterns and how to use them in this tutorial. The java.util.regex API (the package we need to import when working with regex) has two primary classes:

java.util.regex.Pattern – Used for defining patterns
java.util.regex.Matcher – Used for performing match operations on text using patterns

java.util.regex.Pattern Class

Pattern.matches()

We have already seen this method in the example above, where we searched for the string “from” in a given text. It is one of the simplest ways of checking a text against a regex.

String content = "This is a tutorial Website!"; 
String patternString = ".*tutorial.*"; 
boolean isMatch = Pattern.matches(patternString, content); 
System.out.println("The text contains 'tutorial'? " + isMatch); 

As you can see, we used the matches() method of the Pattern class to search for the pattern in the given text. The pattern .*tutorial.* allows zero or more characters before and after the string “tutorial” (the expression .* matches zero or more characters).

Limitations: Pattern.matches() tests the regex against the entire text and only reports whether it matched, so it cannot locate or count individual occurrences. To find multiple occurrences, use the Pattern.compile() method (discussed in the next section) together with a Matcher.
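The limitation above can be sketched as follows. This is only a minimal illustration: the class name and the helper countOccurrences() are made up for this example and are not part of the java.util.regex API. It shows that Pattern.matches() needs the surrounding .* to succeed, while a compiled pattern with a Matcher can count every occurrence.

```java
import java.util.regex.*;

class MatchesLimitationSketch {
    // Hypothetical helper: counts how many times the regex occurs in the text.
    static int countOccurrences(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String content = "This is a tutorial Website!";
        // Without .* the whole text would have to be exactly "tutorial"
        System.out.println(Pattern.matches("tutorial", content));     // false
        System.out.println(Pattern.matches(".*tutorial.*", content)); // true
        // "is" occurs inside "This" and as the separate word "is"
        System.out.println(countOccurrences("is", content));          // 2
    }
}
```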

Pattern.compile()

In the example above we searched for the string “tutorial” in the text, which is a case-sensitive search. If you want a case-insensitive search, or want to search for multiple occurrences, first compile the pattern with Pattern.compile() before matching it against the text. This is how the method can be used for a case-insensitive search.

String content = "This is a tutorial Website!"; 
String patternString = ".*tuToRiAl.*"; 
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE); 

Here we have used the flag Pattern.CASE_INSENSITIVE for a case-insensitive search. Several other flags are available for different purposes.
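As a rough sketch of how flags behave, the example below uses Pattern.CASE_INSENSITIVE on its own and then combines it with Pattern.MULTILINE using the bitwise OR operator. The class name, the helper methods and the sample text are assumptions made for illustration only.

```java
import java.util.regex.*;

class CompileFlagsSketch {
    // Whole-text match that ignores the case of letters.
    static boolean matchesIgnoringCase(String regex, String text) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        return pattern.matcher(text).matches();
    }

    // Counts matches with two flags combined: with MULTILINE,
    // ^ also matches right after each line terminator.
    static int countPerLine(String regex, String text) {
        Pattern pattern = Pattern.compile(regex,
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(
            matchesIgnoringCase(".*tuToRiAl.*", "This is a tutorial Website!")); // true
        String lines = "Java is fun\nI like java\nJAVA rocks";
        // Lines 1 and 3 start with "java" in some letter case
        System.out.println(countPerLine("^java", lines)); // 2
    }
}
```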

Pattern.matcher()

In the section above we learnt how to get a Pattern instance using the compile() method. Here we will learn how to get a Matcher instance from a Pattern instance by using the matcher() method.

String content = "This is a tutorial Website!"; 
String patternString = ".*tuToRiAl.*"; 
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE); 
Matcher matcher = pattern.matcher(content); 
boolean isMatched = matcher.matches(); 
System.out.println("Is it a Match?" + isMatched); 

Output:

Is it a Match?true

Pattern.split()

To split a text into multiple strings based on a delimiter (here the delimiter is specified as a regex), we can use the Pattern.split() method. This is how it can be done.

import java.util.regex.*;
class RegexExample2 {
    public static void main(String args[]) {
        String text = "ThisIsChaitanya.ItISMyWebsite";
        // Pattern for the delimiter
        String patternString = "is";
        Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
        String[] myStrings = pattern.split(text);
        for (String temp : myStrings) {
            System.out.println(temp);
        }
        System.out.println("Number of split strings: " + myStrings.length);
    }
}

Output:

Th 

Chaitanya.It 
MyWebsite 
Number of split strings: 4 

The second split string in the output is an empty string (not null), because the delimiters “is” and “Is” sit next to each other in the text.
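To make the second element easier to see, the sketch below prints the array with Arrays.toString() and also tries the two-argument overload split(CharSequence, int), which caps the number of resulting strings. The class and helper names are hypothetical and used only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitSketch {
    // Splits case-insensitively; a limit of 0 behaves like split(text).
    static String[] splitOn(String regex, String text, int limit) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).split(text, limit);
    }

    public static void main(String[] args) {
        String text = "ThisIsChaitanya.ItISMyWebsite";
        String[] all = splitOn("is", text, 0);
        System.out.println(Arrays.toString(all)); // [Th, , Chaitanya.It, MyWebsite]
        System.out.println(all[1].isEmpty());     // true: an empty string, not null
        // With limit 2 the delimiter is applied only once
        System.out.println(Arrays.toString(splitOn("is", text, 2)));
        // [Th, IsChaitanya.ItISMyWebsite]
    }
}
```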

java.util.regex.Matcher Class

We have already discussed a little bit about Matcher class above. Let’s recall a few things:

Creating a Matcher Instance

String content = "Some text"; 
String patternString = ".*somestring.*"; 
Pattern pattern = Pattern.compile(patternString); 
Matcher matcher = pattern.matcher(content); 

Main methods

1. matches(): It matches the regular expression against the whole text passed to the Pattern.matcher() method while creating a Matcher instance.

... 
Matcher matcher = pattern.matcher(content); 
boolean isMatch = matcher.matches(); 

2. lookingAt(): Similar to matches(), except that it only requires the regular expression to match at the beginning of the text, while matches() requires the whole text to match.

3. find(): Searches for occurrences of the regular expression in the text. Mainly used when we are searching for multiple occurrences.

4. start() and end(): Both methods are generally used along with find(). They return the start index and the end index (exclusive) of the match most recently found by find().
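The difference between the first three methods is easiest to see on one input. The sketch below is a small illustration under assumed names (the class and its three helpers are not part of the API); each helper creates a fresh Matcher so the calls do not affect one another.

```java
import java.util.regex.*;

class MatcherMethodsSketch {
    // The whole text must match the regex.
    static boolean wholeText(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    // The regex must match starting at the beginning of the text.
    static boolean atBeginning(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // The regex may match anywhere in the text.
    static boolean anywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "AA BB AA";
        System.out.println(wholeText("AA", text));   // false
        System.out.println(atBeginning("AA", text)); // true
        System.out.println(anywhere("AA", text));    // true
        System.out.println(atBeginning("BB", text)); // false
        System.out.println(anywhere("BB", text));    // true
    }
}
```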

Let’s take an example to find out the multiple occurrences using Matcher methods:

package beginnersbook.com;
import java.util.regex.*;
class RegexExampleMatcher {
    public static void main(String args[]) {
        String content = "ZZZ AA PP AA QQQ AAA ZZ";
        String string = "AA";
        Pattern pattern = Pattern.compile(string);
        Matcher matcher = pattern.matcher(content);
        // Each call to find() moves on to the next occurrence
        while (matcher.find()) {
            System.out.println("Found at: " + matcher.start() + " - " + matcher.end());
        }
    }
}

Output:

Found at: 4 - 6 
Found at: 10 - 12 
Found at: 17 - 19 

Now we are familiar with the Pattern and Matcher classes and with the process of matching a regular expression against a text. Let’s see what options we have for defining a regular expression:

String Literals

Let’s say you just want to search for a particular string in the text, e.g. “abc”. Then we can simply write the code like this (here the text and the regex are the same):

Pattern.matches("abc", "abc")
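One caveat with string literals is that characters such as the dot keep their regex meaning. As a hedged sketch, the standard method Pattern.quote() can be used to treat a whole string literally; the class name and the helper equalsLiteral() below are made up for this example.

```java
import java.util.regex.Pattern;

class LiteralSketch {
    // Hypothetical helper: matches the text against the literal, character for character.
    static boolean equalsLiteral(String literal, String text) {
        return Pattern.matches(Pattern.quote(literal), text);
    }

    public static void main(String[] args) {
        // Unquoted, the dot matches any character, so "abc" matches too
        System.out.println(Pattern.matches("a.c", "abc")); // true
        // Quoted, the dot only matches a real dot
        System.out.println(equalsLiteral("a.c", "abc"));   // false
        System.out.println(equalsLiteral("a.c", "a.c"));   // true
    }
}
```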

Character Classes

A character class matches a single character in the input text against a set of allowed characters. For example, [Cc]haitanya would match all occurrences of the string “chaitanya” with either a lower-case or an upper-case “C”. A few more examples:

Pattern.matches("[pqr]", "abcd"); Returns false, as the text is not a single p, q or r
Pattern.matches("[pqr]", "r"); Returns true, as r is one of the allowed characters
Pattern.matches("[pqr]", "pq"); Returns false, as the class matches exactly one character, not two

Here is the complete list of character class constructs:

[abc]: Any single character a, b, or c (simple class)
[^abc]: Any single character except a, b, or c (^ denotes negation)
[a-zA-Z]: a through z, or A through Z, inclusive (range)
[a-d[m-p]]: a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]: d, e, or f (intersection)
[a-z&&[^bc]]: a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]: a through z, and not m through p: [a-lq-z] (subtraction)
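The constructs listed above can be tried out one character at a time. The following is a minimal sketch; the class name and the helper isAllowed() are assumptions for illustration.

```java
import java.util.regex.Pattern;

class CharacterClassSketch {
    // True if the text is exactly one character accepted by the character class.
    static boolean isAllowed(String charClass, String text) {
        return Pattern.matches(charClass, text);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("[^abc]", "x"));       // true  (negation)
        System.out.println(isAllowed("[^abc]", "a"));       // false
        System.out.println(isAllowed("[a-d[m-p]]", "n"));   // true  (union)
        System.out.println(isAllowed("[a-d[m-p]]", "e"));   // false
        System.out.println(isAllowed("[a-z&&[def]]", "e")); // true  (intersection)
        System.out.println(isAllowed("[a-z&&[def]]", "a")); // false
        System.out.println(isAllowed("[a-z&&[^bc]]", "d")); // true  (subtraction)
        System.out.println(isAllowed("[a-z&&[^bc]]", "b")); // false
    }
}
```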

Predefined Character Classes – Metacharacters

These are like short codes which you can use while writing regex.

Construct -> Description
.   ->Any character (may or may not match line terminators) 
\d  ->A digit: [0-9] 
\D  ->A non-digit: [^0-9] 
\s  ->A whitespace character: [ \t\n\x0B\f\r] 
\S  ->A non-whitespace character: [^\s] 
\w  ->A word character: [a-zA-Z_0-9] 
\W  ->A non-word character: [^\w] 

For example, Pattern.matches("\\d", "1"); returns true

Pattern.matches("\\D", "z"); returns true

Pattern.matches(".p", "qp"); returns true, as the dot (.) represents any character
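Note that the backslash has to be doubled inside a Java string literal, which is why \d is written as "\\d" in the calls above. The sketch below combines a few predefined classes; the "seat code" format and all names in it are hypothetical and chosen only to illustrate the shorthand.

```java
import java.util.regex.*;

class PredefinedClassSketch {
    // Hypothetical format: two digits followed by one word character, e.g. "12A".
    static boolean looksLikeSeat(String text) {
        return Pattern.matches("\\d\\d\\w", text);
    }

    // Counts the digit characters in the text by repeatedly finding \d.
    static int countDigits(String text) {
        Matcher matcher = Pattern.compile("\\d").matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeSeat("12A")); // true
        System.out.println(looksLikeSeat("12-")); // false, '-' is not a word character
        System.out.println(countDigits("Room 42, floor 7")); // 3
    }
}
```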

Boundary Matchers

^   -> Matches the beginning of a line.
$   -> Matches the end of a line.
\b  -> Matches a word boundary.
\B  -> Matches a non-word boundary.
\A  -> Matches the beginning of the input text.
\G  -> Matches the end of the previous match.
\Z  -> Matches the end of the input text, except for the final terminator, if any.
\z  -> Matches the end of the input text.

For example, Pattern.matches("^Hello$", "Hello"): returns true, as the text begins and ends with Hello

Pattern.matches("^Hello$", "Namaste! Hello"): returns false, as the text does not begin with Hello

Pattern.matches("^Hello$", "Hello Namaste!"): returns false, as the text does not end with Hello
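Boundary matchers become more useful together with find(), since Pattern.matches() already ties the regex to the whole text. As a rough sketch, the example below uses \b to count a word only when it stands alone. The helper names are hypothetical, and the word is assumed to consist of plain letters with no regex metacharacters.

```java
import java.util.regex.*;

class WordBoundarySketch {
    // Counts every occurrence of the regex in the text.
    static int count(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Counts occurrences of the word that have a word boundary on both sides.
    static int countWholeWord(String word, String text) {
        return count("\\b" + word + "\\b", text);
    }

    public static void main(String[] args) {
        String text = "This is his island";
        System.out.println(count("is", text));          // 4: This, is, his, island
        System.out.println(countWholeWord("is", text)); // 1: only the separate word "is"
    }
}
```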

Quantifiers

Greedy    Reluctant   Possessive   Matches
X?        X??         X?+          X once, or not at all (0 or 1 time).
X*        X*?         X*+          X zero or more times.
X+        X+?         X++          X one or more times.
X{n}      X{n}?       X{n}+        X exactly n times.
X{n,}     X{n,}?      X{n,}+       X at least n times.
X{n,m}    X{n,m}?     X{n,m}+      X at least n times, but at most m times.

Examples:

import java.util.regex.*;   
class RegexExample{   
public static void main(String args[]){   
   // It would return true if string matches exactly "tom" 
   System.out.println( 
    Pattern.matches("tom", "Tom")); //False 
/* returns true if the string matches exactly  
    * "tom" or "Tom" 
    */ 
System.out.println( 
  Pattern.matches("[Tt]om", "Tom")); //True 
System.out.println( 
  Pattern.matches("[Tt]om", "tom")); //True 
/* Returns true if the string matches exactly "tim"  
    * or "Tim" or "jin" or "Jin" 
    */ 
System.out.println( 
Pattern.matches("[tT]im|[jJ]in", "Tim"));//True 
System.out.println( 
Pattern.matches("[tT]im|[jJ]in", "jin"));//True 
/* returns true if the string contains "abc" at  
    * any place 
    */ 
System.out.println( 
Pattern.matches(".*abc.*", "deabcpq"));//True 
/* returns true if the string does not have a  
    * number at the beginning 
    */ 
System.out.println( 
  Pattern.matches("^[^\\d].*", "123abc")); //False 
System.out.println( 
  Pattern.matches("^[^\\d].*", "abc123")); //True 
// returns true if the string consists of exactly three letters 
System.out.println( 
  Pattern.matches("[a-zA-Z][a-zA-Z][a-zA-Z]", "aPz"));//True 
System.out.println( 
  Pattern.matches("[a-zA-Z][a-zA-Z][a-zA-Z]", "aAA"));//True 
System.out.println( 
  Pattern.matches("[a-zA-Z][a-zA-Z][a-zA-Z]", "apZx"));//False 
// returns true if the string consists only of non-digits (zero or more) 
System.out.println( 
  Pattern.matches("\\D*", "abcde")); //True 
System.out.println( 
  Pattern.matches("\\D*", "abcde123")); //False 
/* Boundary Matchers example 
    * ^ denotes start of the line 
    * $ denotes end of the line 
    */ 
System.out.println( 
  Pattern.matches("^This$", "This is Chaitanya")); //False 
System.out.println( 
  Pattern.matches("^This$", "This")); //True 
System.out.println( 
  Pattern.matches("^This$", "Is This Chaitanya")); //False 
} 
} 
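The examples above do not exercise the greedy, reluctant and possessive columns of the quantifier table, so here is a minimal sketch of how the three differ. The class name, the helper firstMatch() and the sample text are assumptions for illustration; Matcher.group() is the standard method that returns the text of the most recent match.

```java
import java.util.regex.*;

class QuantifierSketch {
    // Returns the first substring matched by the regex, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String text = "<b>bold</b>";
        // Greedy: .+ takes as much as possible, then backs off to the last '>'
        System.out.println(firstMatch("<.+>", text));  // <b>bold</b>
        // Reluctant: .+? takes as little as possible, stopping at the first '>'
        System.out.println(firstMatch("<.+?>", text)); // <b>
        // Possessive: .++ takes everything and never gives any of it back,
        // so the closing '>' in the regex can never match
        System.out.println(firstMatch("<.++>", text)); // null
        // Bounded repetition: at least 2 and at most 3 times
        System.out.println(firstMatch("a{2,3}", "aaaa")); // aaa
    }
}
```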
