Presentation notes from JMU Unix Users Group meetings

Regular Expressions

Connor Sample -

Regex 101 documentation

What are regular expressions?

Common use cases

Basic Syntax

Basic Syntax - Character classes

Basic Syntax - Quantifiers

Escaping special characters


  1. Finding patterns in files:
    • Use grep to search for specific strings or patterns
    • grep 'error' logfile.txt
    • grep -E '([0-9]{1,3}\.){3}[0-9]{1,3}' access.log
  2. Replacing text in files:
    • Use sed or awk to perform find and replace operations
    • sed -i 's/\.html"/"/g' file.txt
  3. Validating input:
    • Ensure input adheres to specific formats or constraints
    • <input type="text" pattern="[A-Za-z]{3}">
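The same three use cases can be sketched with Python's standard `re` module. This is only an illustration: the sample log lines, the `links` string, and the helper name `is_three_letters` are made up for the example and are not part of the talk.

```python
import re

# Hypothetical sample data standing in for logfile.txt / access.log.
log_lines = [
    "10.0.0.1 - GET /index.html 200",
    "error: disk full",
    "192.168.1.20 - GET /about.html 404",
]

# 1. Finding patterns: roughly what the grep -E call does, line by line.
ip_pattern = re.compile(r"([0-9]{1,3}\.){3}[0-9]{1,3}")
ip_lines = [line for line in log_lines if ip_pattern.search(line)]

# 2. Replacing text: the same substitution as the sed command
#    (drop `.html` when it is directly followed by a double quote).
links = '<a href="about.html">About</a>'
cleaned = re.sub(r'\.html"', '"', links)


# 3. Validating input: fullmatch requires the whole value to match,
#    which mirrors how the HTML pattern attribute behaves.
def is_three_letters(value: str) -> bool:
    return re.fullmatch(r"[A-Za-z]{3}", value) is not None


print(ip_lines)
print(cleaned)
print(is_three_letters("abc"), is_three_letters("abcd"))
```

Note that the IP pattern only checks the shape of an address (four dot-separated groups of one to three digits), so it would also accept out-of-range values such as 999.999.999.999.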

Advanced Techniques

Word Boundary Marker

More Advanced Techniques

Example 1: Matching Digits

Text: “I have 3 apples and 5 oranges.”

Example 1: Solution

Regex: \d+
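One way to try this solution is with Python's `re.findall`, which returns every non-overlapping match. A minimal sketch (the variable names are chosen for the example):

```python
import re

text = "I have 3 apples and 5 oranges."

# \d+ matches each run of one or more consecutive digits.
numbers = re.findall(r"\d+", text)
print(numbers)  # ['3', '5']
```

The matches come back as strings, so they would need `int()` before any arithmetic.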

Example 2: Matching Words with 3 characters

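\b matches a “word boundary”: the zero-width position between a word character (\w) and a non-word character, or between a word character and the start or end of the text. It consumes no characters itself.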

Text: “The quick brown fox jumps over the lazy dog.”

Example 2: Solution

Regex: \b\w{3}\b
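A short Python sketch of this solution, with an unanchored variant alongside to show what the \b markers contribute (the variable names are chosen for the example):

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# With boundaries: only whole words of exactly three characters.
three_letter_words = re.findall(r"\b\w{3}\b", text)
print(three_letter_words)  # ['The', 'fox', 'the', 'dog']

# Without boundaries: \w{3} also matches the first three characters
# of longer words such as "quick" and "brown".
unanchored = re.findall(r"\w{3}", text)
print(unanchored)
```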

Example 3: Matching Parts of a Date

Use capture groups to extract March, 5, and 2024.

Text: “March 5th, 2024”

Example 3: Solution

Regex: ([A-Za-z]+)\s+(\d{1,2})(?:[A-Za-z]*),\s+(\d{4})
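In Python, the three capture groups can be read back with `match.groups()`. A minimal sketch, assuming the date appears in exactly the format shown (the variable names are chosen for the example):

```python
import re

text = "March 5th, 2024"
date_pattern = re.compile(r"([A-Za-z]+)\s+(\d{1,2})(?:[A-Za-z]*),\s+(\d{4})")

match = date_pattern.search(text)
if match:
    # The non-capturing group (?:...) consumes the "th" suffix
    # without adding it to the result.
    month, day, year = match.groups()
    print(month, day, year)  # March 5 2024
```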

Example 4: Matching Email Addresses

Text: “Contact us at”

Simplified email specification:

Domain specification:

Use \b to ensure the email is standalone

Example 4: Solution

Regex: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
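A quick way to check this pattern in Python is sketched below. The sample sentence and the address `support@example.com` are placeholders invented for this example, not the text from the talk.

```python
import re

# Hypothetical sample text; the address is a placeholder.
text = "Contact us at support@example.com for help."

email_pattern = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

# The pattern has no capture groups, so findall returns whole matches.
emails = email_pattern.findall(text)
print(emails)  # ['support@example.com']
```

This is a simplified pattern, so it accepts some strings that are not valid addresses and rejects some unusual ones that are.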

Best practices




import re

pattern = re.compile(r"""
\d+           # match one or more digits
\.            # match the `.` character
[A-Za-z]{5}   # match 5 letters of any case 
""", re.X)

pattern2 = re.compile(r"""(?x)  # verbose mode, set inline; the flag must open the pattern
\d*   # match zero or more digits
""")
