5 min read By Vamsi Karuturi · Senior Backend Engineer at Salesforce

Java Regular Expressions

Regular expressions (regex) are a powerful pattern-matching language used for searching, validating, and manipulating text. Java's java.util.regex package provides a robust regex engine based on Perl-style patterns. Regex is a frequent topic in interviews — interviewers test both your ability to write patterns and your understanding of the underlying engine.

Core Classes: Pattern and Matcher

Java's regex engine revolves around two classes:

Class	Purpose
`Pattern`	Compiled representation of a regex. Immutable and thread-safe.
`Matcher`	Stateful engine that performs match operations against a `CharSequence`.

Java

// 1. Compile the pattern (expensive — do once, reuse)
Pattern pattern = Pattern.compile("\\d{3}-\\d{4}");

// 2. Create a matcher against input
Matcher matcher = pattern.matcher("Call 555-1234 now");

// 3. Use matcher methods
if (matcher.find()) {
    System.out.println(matcher.group()); // "555-1234"
    System.out.println(matcher.start()); // 5
    System.out.println(matcher.end());   // 13
}

Key Matcher Methods

Method	Description
`matches()`	Entire input must match the pattern
`find()`	Finds the next subsequence that matches
`lookingAt()`	Input must match from the beginning (but need not consume all)
`group()`	Returns the matched subsequence
`start()` / `end()`	Start and end indices of the match
`replaceAll(String)`	Replaces every match with the replacement
`reset(CharSequence)`	Reuses the matcher with new input
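
The `reset(CharSequence)` row is the least familiar of these. A minimal sketch, assuming single-threaded use and a hypothetical `countDigitRuns` helper (not part of any library), of pointing one Matcher at several inputs instead of allocating a new one each time:

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherReuseDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Counts runs of digits across all inputs with a single Matcher instance.
    static int countDigitRuns(List<String> inputs) {
        Matcher m = DIGITS.matcher("");   // start with empty input
        int count = 0;
        for (String s : inputs) {
            m.reset(s);                   // same matcher, new input
            while (m.find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDigitRuns(List.of("a1b22", "none", "333"))); // 3
    }
}
```

Because the Matcher carries state between calls, this kind of reuse only makes sense inside one thread.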

Compilation Flags

Pass flags to Pattern.compile() to alter matching behavior.

Flag	Constant	Effect
`(?i)`	`Pattern.CASE_INSENSITIVE`	Case-insensitive matching (ASCII only)
`(?m)`	`Pattern.MULTILINE`	`^` and `$` match line boundaries, not just input boundaries
`(?s)`	`Pattern.DOTALL`	`.` matches everything including `\n`
`(?x)`	`Pattern.COMMENTS`	Allows whitespace and comments in the pattern
`(?u)`	`Pattern.UNICODE_CASE`	Unicode-aware case folding (use with `CASE_INSENSITIVE`)
`(?d)`	`Pattern.UNIX_LINES`	Only `\n` is recognized as a line terminator

Java

// Combine flags with bitwise OR
Pattern p = Pattern.compile("hello world",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

// Or embed flags inline
Pattern p2 = Pattern.compile("(?im)hello world");

MULTILINE vs DOTALL

MULTILINE changes what ^ and $ mean — they match at line boundaries instead of only at the start/end of the entire input. DOTALL changes what . matches — it includes newline characters. These are independent and frequently confused in interviews.
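
To make the distinction concrete, here is a small sketch that runs the same two patterns with and without each flag. The `firstMatch` helper and the sample input are illustrative assumptions, not part of the original examples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FlagDemo {
    static final String INPUT = "first\nsecond";

    // Returns the first match of regex (compiled with the given flags) in INPUT, or null.
    static String firstMatch(String regex, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(INPUT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // DOTALL affects '.': without it the match stops at the newline
        System.out.println(firstMatch("f.*", 0));                  // first
        System.out.println(firstMatch("f.*", Pattern.DOTALL));     // first + newline + second

        // MULTILINE affects '^': without it "second" is not at a start position
        System.out.println(firstMatch("^second", 0));                 // null
        System.out.println(firstMatch("^second", Pattern.MULTILINE)); // second
    }
}
```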

Character Classes

Syntax	Matches	Example
`[abc]`	Any of a, b, or c	`[aeiou]` matches vowels
`[^abc]`	Anything except a, b, c	`[^0-9]` matches non-digits
`[a-z]`	Range a through z	`[A-Za-z]` matches letters
`[a-z&&[^m-p]]`	Intersection (a-z minus m-p)	Java-specific syntax
`.`	Any character (except `\n` by default)
`\d`	Digit `[0-9]`
`\D`	Non-digit `[^0-9]`
`\w`	Word character `[a-zA-Z0-9_]`
`\W`	Non-word character
`\s`	Whitespace `[ \t\n\x0B\f\r]`
`\S`	Non-whitespace
`\b`	Word boundary
`\B`	Non-word boundary

Double Backslash in Java

Java strings require escaping backslashes. Write \\d in code to represent the regex \d. This is the most common source of regex bugs in Java.
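
A short sketch of the escaping rule in practice. The class and method names are hypothetical; `Pattern.quote()` is a standard JDK method that is not mentioned above and is included here as one way to avoid hand-escaping a literal.

```java
import java.util.regex.Pattern;

final class EscapeDemo {
    // Regex \d\.\d : every regex backslash is doubled in the Java literal
    private static final Pattern DECIMAL = Pattern.compile("\\d\\.\\d");

    // Regex \\ (one literal backslash) needs four backslashes in Java source
    private static final Pattern BACKSLASH = Pattern.compile("\\\\");

    static boolean isDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    static boolean hasBackslash(String s) {
        return BACKSLASH.matcher(s).find();
    }

    // Pattern.quote turns the whole string into a literal, so '.' loses its special meaning
    static boolean containsLiteral(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isDecimal("1.5"));              // true
        System.out.println(isDecimal("1x5"));              // false
        System.out.println(hasBackslash("C:\\temp"));      // true
        System.out.println(containsLiteral("a.b", "axb")); // false
    }
}
```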

Quantifiers

Greedy, Reluctant, and Possessive

Greedy	Reluctant	Possessive	Meaning
`X?`	`X??`	`X?+`	Zero or one
`X*`	`X*?`	`X*+`	Zero or more
`X+`	`X+?`	`X++`	One or more
`X{n}`	`X{n}?`	`X{n}+`	Exactly n
`X{n,}`	`X{n,}?`	`X{n,}+`	At least n
`X{n,m}`	`X{n,m}?`	`X{n,m}+`	Between n and m

Java

String input = "<b>bold</b> and <i>italic</i>";

// Greedy: matches as much as possible, then backtracks
Pattern greedy = Pattern.compile("<.+>");
// Matches: "<b>bold</b> and <i>italic</i>"  (one huge match)

// Reluctant: matches as little as possible
Pattern reluctant = Pattern.compile("<.+?>");
// Matches: "<b>", "</b>", "<i>", "</i>"  (four separate matches)

// Possessive: matches as much as possible, NO backtracking
Pattern possessive = Pattern.compile("<.++>");
// No match at all! Consumes everything, won't give back the ">"

When to Use Possessive Quantifiers

A possessive quantifier never gives back what it has matched. Use one when you know the consumed characters cannot be needed by a later part of the pattern: the engine then skips pointless retries, which speeds up failing matches and can prevent catastrophic backtracking (ReDoS).
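
One case where this condition holds is a quoted string: `[^"]` can never match the closing quote, so nothing is lost by refusing to backtrack. The sketch below is illustrative; the `firstQuoted` helper is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PossessiveDemo {
    // Regex "[^"]*+" : the possessive *+ keeps everything up to the next quote
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*+\"");

    // Returns the first double-quoted string (quotes included), or null if none is closed.
    static String firstQuoted(String input) {
        Matcher m = QUOTED.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstQuoted("say \"hi\" now")); // "hi" (with quotes)
        System.out.println(firstQuoted("say \"hi"));       // null: fails without retrying shorter runs
    }
}
```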

Anchors

Anchor	Meaning
`^`	Start of input (or line in `MULTILINE` mode)
`$`	End of input (or line in `MULTILINE` mode)
`\b`	Word boundary
`\B`	Non-word boundary
`\A`	Absolute start of input (ignores `MULTILINE`)
`\Z`	End of input, before final terminator
`\z`	Absolute end of input

Java

// \b ensures we match whole words only
Pattern p = Pattern.compile("\\bcat\\b");
Matcher m = p.matcher("concatenate the cat in the catalog");
// Finds only "cat" at index 16, not inside "concatenate" or "catalog"
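
The less common anchors from the table differ mainly around a trailing newline and under MULTILINE. A small sketch with hypothetical helper names and sample inputs chosen for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorDemo {
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // \Z tolerates a final line terminator, \z does not
        System.out.println(found("end\\Z", "end\n")); // true
        System.out.println(found("end\\z", "end\n")); // false

        // Under MULTILINE, ^ matches at each line start but \A still matches only once
        System.out.println(count("(?m)^line", "line\nline"));   // 2
        System.out.println(count("(?m)\\Aline", "line\nline")); // 1
    }
}
```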

Capturing Groups and Named Groups

Parentheses () create capturing groups. Groups are numbered left-to-right starting at 1. Group 0 is always the entire match.

Java

Pattern datePattern = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
Matcher m = datePattern.matcher("Date: 2025-07-15");

if (m.find()) {
    System.out.println(m.group(0)); // "2025-07-15"  (full match)
    System.out.println(m.group(1)); // "2025"        (year)
    System.out.println(m.group(2)); // "07"          (month)
    System.out.println(m.group(3)); // "15"          (day)
}

Named Groups (Java 7+)

Use (?<name>...) to assign names to groups. Improves readability significantly.

Java

Pattern p = Pattern.compile(
    "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
Matcher m = p.matcher("2025-07-15");

if (m.find()) {
    System.out.println(m.group("year"));  // "2025"
    System.out.println(m.group("month")); // "07"
    System.out.println(m.group("day"));   // "15"
}

Non-Capturing Groups

Use (?:...) when you need grouping for alternation or quantifiers but do not need to capture.

Java

// Non-capturing: groups for alternation without creating a capture
Pattern p = Pattern.compile("(?:http|https)://(.+)");
Matcher m = p.matcher("https://example.com");

if (m.find()) {
    System.out.println(m.group(1)); // "example.com" — group 1, not 2
}

Backreferences

Backreferences refer to previously captured groups within the same pattern. \1 refers to group 1, \k<name> refers to a named group.

Java

// Find repeated words
Pattern p = Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("The the quick brown fox fox jumped");

while (m.find()) {
    System.out.println("Duplicate: " + m.group()); 
    // "The the"
    // "fox fox"
}

Java

// Named backreference
Pattern p = Pattern.compile("(?<tag>\\w+)=\\k<tag>");
Matcher m = p.matcher("test=test value=value bad=wrong");

while (m.find()) {
    System.out.println(m.group()); // "test=test", "value=value"
}

Lookahead and Lookbehind Assertions

Lookarounds assert that a pattern exists (or does not exist) at a position without consuming characters.

Syntax	Name	Meaning
`(?=X)`	Positive lookahead	Followed by X
`(?!X)`	Negative lookahead	NOT followed by X
`(?<=X)`	Positive lookbehind	Preceded by X
`(?<!X)`	Negative lookbehind	NOT preceded by X

Java

// Password validation: at least 8 chars, one uppercase, one digit, one special
Pattern strongPassword = Pattern.compile(
    "^(?=.*[A-Z])(?=.*\\d)(?=.*[@#$%^&+=!]).{8,}$"
);

System.out.println(strongPassword.matcher("Passw0rd!").matches());  // true
System.out.println(strongPassword.matcher("weakpass").matches());   // false

Java

// Lookbehind: extract amounts after a dollar sign
Pattern p = Pattern.compile("(?<=\\$)\\d+\\.\\d{2}");
Matcher m = p.matcher("Price: $49.99 and $12.50");

while (m.find()) {
    System.out.println(m.group()); // "49.99", "12.50" (without the $)
}

Java

// Negative lookahead: find "foo" NOT followed by "bar"
Pattern p = Pattern.compile("foo(?!bar)");
Matcher m = p.matcher("foobar foobaz foo");

while (m.find()) {
    System.out.println(m.start()); // 7 ("foobaz"), 14 ("foo")
}

Lookbehind Limitations in Java

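Java requires a lookbehind to have a bounded maximum length. Java 8, for example, rejects (?<=a+) with a PatternSyntaxException; some later JDKs are more lenient, but a bounded form such as (?<=a{1,10}), or a restructured pattern that uses lookahead instead, is the portable choice.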
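
A minimal sketch of a bounded variable-length lookbehind. The scenario (a dollar sign followed by at most three spaces) and the `amounts` helper are assumptions made for illustration; `Matcher.results()` requires Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class BoundedLookbehindDemo {
    // The lookbehind has a maximum length of 4: '$' plus zero to three whitespace characters
    private static final Pattern AMOUNT = Pattern.compile("(?<=\\$\\s{0,3})\\d+");

    static List<String> amounts(String text) {
        return AMOUNT.matcher(text).results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(amounts("$ 42 and $  7 and 9")); // [42, 7]
    }
}
```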

String Convenience Methods

Java's String class provides regex-powered methods for quick operations.

Java

String input = "Hello, World!";

// matches() — entire string must match
input.matches("[A-Za-z, !]+");  // true

// split() — split on a pattern
"one:two::three".split(":");     // ["one", "two", "", "three"]
"one:two::three".split(":", 3);  // ["one", "two", ":three"] (limit = 3 parts)
"one:two::three".split(":", -1); // ["one", "two", "", "three"] (keep trailing empties)

// replaceAll() — replace all matches
"2025-07-15".replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$2/$3/$1");
// Result: "07/15/2025"

// replaceFirst() — replace first match only
"aaa bbb aaa".replaceFirst("aaa", "ccc");  // "ccc bbb aaa"

Pattern.compile() vs String.matches() Performance

This is a classic interview question. String.matches() recompiles the pattern every single call.

Java

// BAD: recompiles pattern on every iteration — O(n * compile_cost)
for (String email : emails) {
    if (email.matches("[\\w.]+@[\\w.]+\\.\\w+")) {
        // ...
    }
}

// GOOD: compile once, reuse — O(compile_cost + n * match_cost)
private static final Pattern EMAIL_PATTERN = 
    Pattern.compile("[\\w.]+@[\\w.]+\\.\\w+");

for (String email : emails) {
    if (EMAIL_PATTERN.matcher(email).matches()) {
        // ...
    }
}

Performance Rule of Thumb

If you use a regex more than once, always pre-compile with Pattern.compile() and store it in a static final field. The Pattern object is immutable and thread-safe, making this pattern safe for concurrent access.

Common Patterns

These are frequently asked in interviews. Understand the trade-offs — production-grade validation often requires more than regex alone.

Java

// Email (simplified — RFC 5322 is much more complex)
Pattern EMAIL = Pattern.compile(
    "^[\\w.%+-]+@[\\w.-]+\\.[a-zA-Z]{2,}$"
);

// US Phone Number (flexible format)
Pattern PHONE = Pattern.compile(
    "^\\(?\\d{3}\\)?[-.\\s]?\\d{3}[-.\\s]?\\d{4}$"
);

// IPv4 Address
Pattern IPV4 = Pattern.compile(
    "^((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)$"
);

// URL (http/https)
Pattern URL = Pattern.compile(
    "^https?://[\\w.-]+(?:\\.[a-zA-Z]{2,})(?:/[\\w./?%&=+-]*)?$"
);

// Java identifier
Pattern IDENTIFIER = Pattern.compile(
    "^[a-zA-Z_$][a-zA-Z0-9_$]*$"
);

// ISO date (YYYY-MM-DD)
Pattern ISO_DATE = Pattern.compile(
    "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$"
);

Regex is Not a Validator

These patterns check format, not validity. For example, the date pattern accepts 2025-02-31 which does not exist. For production validation, use proper libraries (InternetAddress for email, InetAddress for IP, LocalDate.parse() for dates).
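
As a sketch of combining the two layers, the regex below rejects badly formatted input cheaply and `LocalDate.parse()` then rejects dates that do not exist. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

final class DateCheck {
    private static final Pattern ISO_DATE = Pattern.compile(
            "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$");

    // True only if the text has the YYYY-MM-DD shape AND names a real calendar date.
    static boolean isRealDate(String text) {
        if (!ISO_DATE.matcher(text).matches()) {
            return false;               // wrong shape
        }
        try {
            LocalDate.parse(text);      // strict ISO parsing rejects e.g. February 31
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRealDate("2025-07-15")); // true
        System.out.println(isRealDate("2025-02-31")); // false: passes the regex, fails the parse
        System.out.println(isRealDate("15/07/2025")); // false: fails the regex
    }
}
```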

ReDoS: Catastrophic Backtracking

ReDoS (Regular Expression Denial of Service) occurs when a regex engine takes exponential time on certain inputs due to excessive backtracking.

The Problem

Java

// VULNERABLE: nested quantifiers create exponential backtracking
Pattern bad = Pattern.compile("(a+)+b");

// Benign input: matches quickly
bad.matcher("aaaaab").matches(); // true, fast

// Malicious input: no match, but engine tries every combination
bad.matcher("aaaaaaaaaaaaaaaaaaaaa!").matches(); 
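// Backtracking work can grow exponentially with the number of 'a's;
// how bad it gets in practice depends on the input length and the JDK version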

When the final b cannot match, the engine retries every way of splitting the run of a's between the inner a+ and the outer +. A run of n characters can be split in 2^(n-1) ways, so the worst case is O(2^n).

Vulnerable Patterns to Avoid

Pattern	Problem
`(a+)+`	Nested quantifiers
`(a|a)*`	Alternation inside repetition with overlap
`(.*a){n}`	Wildcard with suffix inside repetition
`(\w+\s?)+`	Overlapping possibilities for each position

How to Prevent ReDoS

Use possessive quantifiers: (a++)b prevents backtracking entirely
Use atomic groups: (?>a+)b (Java supports this)
Avoid nested quantifiers: Flatten (a+)+ to a+
Anchor patterns: Use ^ and $ to constrain matching
Set timeouts: Limit regex execution time
Test with tools: Use regex analysis tools to detect vulnerable patterns

Java

// SAFE: possessive quantifier prevents backtracking
Pattern safe = Pattern.compile("(a++)b");
safe.matcher("aaaaaaaaaaaaaaaaaaaaa!").matches(); // false, instant

// SAFE: atomic group (equivalent to possessive)
Pattern atomic = Pattern.compile("(?>a+)b");
atomic.matcher("aaaaaaaaaaaaaaaaaaaaa!").matches(); // false, instant
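
The "Set timeouts" item has no direct switch in java.util.regex. One commonly used workaround, sketched here under that assumption, is to wrap the input in a CharSequence that checks a deadline on every read. `TimedRegex`, `DeadlineCharSequence`, and `matchesWithin` are hypothetical names, and the choice of IllegalStateException is arbitrary.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

final class TimedRegex {

    /** Delegates to the real input but fails once the deadline has passed. */
    private static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // The matcher reads input through charAt, so this check also runs while backtracking
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new IllegalStateException("regex exceeded its time budget");
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    // Full-match check that throws IllegalStateException if it runs past the budget.
    static boolean matchesWithin(Pattern pattern, String input, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        return pattern.matcher(new DeadlineCharSequence(input, deadline)).matches();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(a+)+b");
        System.out.println(matchesWithin(p, "aaaaab", 1_000)); // true
    }
}
```

The caller decides how to treat the exception, for example by rejecting the input. This limits damage from a bad pattern but does not replace fixing the pattern itself.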

Interview Red Flag

If an interviewer asks you to write a regex for user-facing input validation, always mention ReDoS as a concern. It shows security awareness and production-mindset thinking.

Java 9+ Regex Enhancements

Version	Feature
Java 9	`Matcher.results()` returns a `Stream<MatchResult>`
Java 9	`Scanner.findAll()` returns a `Stream<MatchResult>`
Java 11	`Pattern.asMatchPredicate()` returns a `Predicate<String>` that requires a full match

Java

// Java 9+: Stream-based matching
Pattern p = Pattern.compile("\\b\\w{5}\\b");
List<String> fiveLetterWords = p.matcher("Hello brave new world today")
    .results()
    .map(MatchResult::group)
    .collect(Collectors.toList());
// ["Hello", "brave", "world", "today"]

// Java 11+: Use as a predicate for filtering (asMatchPredicate was added in Java 11)
Pattern emailPattern = Pattern.compile("[\\w.]+@[\\w.]+\\.\\w+");
Predicate<String> isEmail = emailPattern.asMatchPredicate();

List<String> validEmails = candidates.stream()
    .filter(isEmail)
    .collect(Collectors.toList());

Interview Questions

What is the difference between matches(), find(), and lookingAt()?

matches() requires the entire input to match the pattern. Roughly equivalent to wrapping the pattern in \A(?:...)\z.
find() searches for the next subsequence anywhere in the input that matches. Can be called repeatedly to find all matches.
lookingAt() checks if the input starts with the pattern, but does not require it to consume the entire input.

Java

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("abc 123 def");

m.matches();    // false — "abc 123 def" is not all digits
m.lookingAt();  // false — does not start with digits
m.find();       // true  — finds "123" within the input

Why should you avoid String.matches() in a loop?

String.matches() internally calls Pattern.compile() every invocation. Pattern compilation involves parsing the regex, building an NFA, and optimizing it — this is expensive. In a loop processing thousands of strings, you pay this cost every iteration.

The correct approach is to pre-compile the Pattern into a static final field and create new Matcher instances per input. Pattern is immutable and thread-safe; Matcher is not.

Explain greedy vs reluctant vs possessive quantifiers with an example.

Given input "<b>bold</b>" and pattern <.+>:

Greedy (<.+>): .+ first consumes everything after the opening < (b>bold</b>), then backtracks one character so the final > can match. Result: <b>bold</b> (one match spanning the whole string).
Reluctant (<.+?>): .+? consumes as little as possible (b), then checks if > matches. Result: <b> and </b> (two separate matches).
Possessive (<.++>): .++ consumes as much as possible and refuses to backtrack. Since it consumed the final >, the pattern cannot match. Result: no match.

Possessive quantifiers are useful for performance — they fail fast when no match is possible and prevent catastrophic backtracking.
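
The three results can be checked directly against the input `<b>bold</b>`. This is a small sketch; `allMatches` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Returns every match of regex in input, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";
        System.out.println(allMatches("<.+>", input));  // [<b>bold</b>]
        System.out.println(allMatches("<.+?>", input)); // [<b>, </b>]
        System.out.println(allMatches("<.++>", input)); // []
    }
}
```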

What is the difference between a capturing group and a non-capturing group?

Capturing group (X): Matches X and remembers the match. Accessible via matcher.group(n). Each capturing group is assigned a number (left-to-right by opening parenthesis).
Non-capturing group (?:X): Matches X but does not remember it. Used for grouping alternation or applying quantifiers without the overhead of capturing.

Use non-capturing groups when you only need logical grouping. This reduces memory usage and keeps group numbering clean.

Java

// Capturing: group(1) = "http" or "https"
Pattern p1 = Pattern.compile("(https?)://(.+)");

// Non-capturing: group(1) = everything after "://" (host and path)
Pattern p2 = Pattern.compile("(?:https?)://(.+)");

How do lookahead and lookbehind work? Give a practical example.

Lookarounds are zero-width assertions — they check for a condition at a position without consuming any characters.

(?=X) Positive lookahead: succeeds if X matches ahead
(?!X) Negative lookahead: succeeds if X does NOT match ahead
(?<=X) Positive lookbehind: succeeds if X matches behind
(?<!X) Negative lookbehind: succeeds if X does NOT match behind

Practical example — extract numbers that are preceded by $:

Java

Pattern p = Pattern.compile("(?<=\\$)\\d+\\.?\\d*");
Matcher m = p.matcher("Items: $50, 30 units, $19.99");
// Finds: "50", "19.99" (without the dollar sign)

Practical example — password must contain at least one digit and one uppercase letter:

Java

Pattern p = Pattern.compile("^(?=.*[A-Z])(?=.*\\d).{8,}$");

Each lookahead checks a condition at position 0 without advancing the cursor.

What is ReDoS and how do you prevent it?

ReDoS (Regular Expression Denial of Service) is a vulnerability where a malicious input causes the regex engine to enter catastrophic backtracking, consuming exponential time.

It occurs with patterns that have:

Nested quantifiers: (a+)+
Overlapping alternation: (a|a)+
Ambiguous repetition: (\w+\s?)+

Prevention strategies:

Use possessive quantifiers (a++) or atomic groups ((?>a+)) to prevent backtracking
Avoid nested quantifiers — flatten them
Anchor patterns with ^ and $
Set a timeout or run regex in a separate thread with a deadline
Use static analysis tools to detect vulnerable patterns before deployment

What does the MULTILINE flag actually do? How is it different from DOTALL?

MULTILINE ((?m)) changes the behavior of ^ and $:

Without it: ^ matches only at the start of input, and $ matches only at the end of input (or just before a final line terminator)
With it: ^ also matches after \n, $ also matches before \n

DOTALL ((?s)) changes the behavior of .:

Without it: . matches any character except \n
With it: . matches any character including \n

They are independent flags. You can use both, one, or neither.

Java

String input = "line1\nline2\nline3";

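// Without MULTILINE: ^ only matches at the start of the entire input
Matcher first = Pattern.compile("^\\w+").matcher(input);
first.find();   // true
first.group();  // "line1" (the only match)

// With MULTILINE: ^ matches at the start of each line
// (Stream.toList() needs Java 16+; use collect(Collectors.toList()) on older versions)
Pattern.compile("(?m)^\\w+").matcher(input).results()
    .map(MatchResult::group).toList(); // ["line1", "line2", "line3"]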

How would you extract all key-value pairs from a config string like 'host=localhost;port=8080;debug=true'?

Use named capturing groups for clarity:

Java

Pattern p = Pattern.compile("(?<key>\\w+)=(?<value>[^;]+)");
Matcher m = p.matcher("host=localhost;port=8080;debug=true");

Map<String, String> config = new LinkedHashMap<>();
while (m.find()) {
    config.put(m.group("key"), m.group("value"));
}
// {host=localhost, port=8080, debug=true}

In Java 9+, you can use matcher.results() with streams for a more functional approach:

Java

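// MatchResult.group(String) only exists from Java 20, so use group numbers here
String input = "host=localhost;port=8080;debug=true";
Map<String, String> config = p.matcher(input).results()
    .collect(Collectors.toMap(
        mr -> mr.group(1),  // key
        mr -> mr.group(2)   // value
    ));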
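
A self-contained version of the stream approach, as a sketch. It assumes that insertion order should be preserved and that a repeated key should keep its last value; both choices are illustrative, and `ConfigParser` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class ConfigParser {
    // Group 1 is the key, group 2 the value (numbered groups work on Java 9+)
    private static final Pattern PAIR = Pattern.compile("(\\w+)=([^;]+)");

    static Map<String, String> parse(String config) {
        return PAIR.matcher(config).results()
                .collect(Collectors.toMap(
                        mr -> mr.group(1),
                        mr -> mr.group(2),
                        (first, second) -> second,   // assumption: last duplicate wins
                        LinkedHashMap::new));        // keeps keys in input order
    }

    public static void main(String[] args) {
        System.out.println(parse("host=localhost;port=8080;debug=true"));
        // {host=localhost, port=8080, debug=true}
    }
}
```

Without the merge function, `Collectors.toMap` throws IllegalStateException on a duplicate key, which may be the better behavior for strict config parsing.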

Is Pattern thread-safe? Is Matcher thread-safe?

Pattern is immutable and thread-safe. You should compile it once and store it in a static final field.

Matcher is stateful and NOT thread-safe. It maintains internal state (current position, group captures, etc.). Each thread must create its own Matcher instance via pattern.matcher(input).

Java

// Correct concurrent usage
private static final Pattern PATTERN = Pattern.compile("\\d+");

// Each thread creates its own Matcher
public boolean validate(String input) {
    return PATTERN.matcher(input).matches(); // new Matcher each call
}

Write a regex to validate an IPv4 address.

Each octet must be 0-255. This requires careful range handling:

Java

Pattern IPV4 = Pattern.compile(
    "^((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}" +
    "(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)$"
);

IPV4.matcher("192.168.1.1").matches();    // true
IPV4.matcher("255.255.255.255").matches(); // true
IPV4.matcher("256.1.1.1").matches();       // false
IPV4.matcher("1.2.3").matches();           // false

Breakdown of each octet: 25[0-5] matches 250-255, 2[0-4]\d matches 200-249, [01]?\d\d? matches 0-199. The order matters because the regex engine tries alternatives left-to-right.

In production, prefer InetAddress.getByName() with validation — regex cannot catch semantic issues like reserved ranges.
