INST 326 Module 5
Topics
Module 5 introduces a versatile toolkit for text processing, parsing, and searching. Regular expressions (often shortened to "regex") are a language for defining patterns of text that can be used to find one or more matches in a target string. Libraries for working with regular expressions exist in many programming environments, including Python (the re module).
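To make "a pattern that finds matches in a target string" concrete, here is a minimal sketch using Python's re module. The sample sentence and the variable names are invented for illustration, and re.finditer() is only one of several ways to walk through matches.

```python
import re

# Hypothetical sample text, made up for this illustration.
text = "INST 326 met in 2019, 2021, and 2023."

# \b marks a word boundary and \d{4} means exactly four digits,
# so the pattern matches standalone four-digit numbers (not "326").
pattern = r"\b\d{4}\b"

# finditer() yields one match object per non-overlapping match;
# each match object knows the matched text and where it starts.
matches = [(m.group(), m.start()) for m in re.finditer(pattern, text)]

for found, position in matches:
    print(f"found {found!r} at index {position}")
```

The pattern is written as a raw string (the r prefix) so that Python passes backslashes such as \d through to the regex engine unchanged.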
Learning Outcomes
After completing this module, students should understand:
- The basics of writing regular expressions.
- How to test regexes using sites such as Regex 101.
- How to use regexes in Python programs through the re module.
- The main functions for applying regexes in Python: search(), match(), and findall().
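The difference between the three functions named above is easiest to see side by side. The following is a small sketch, not a complete tour of the re module; the sample text, the phone-number pattern, and the variable names are assumptions chosen for illustration.

```python
import re

# Hypothetical sample text and pattern, made up for this illustration.
text = "Call 301-555-0100 or 301-555-0199 for help."
phone = r"\d{3}-\d{3}-\d{4}"

# search() scans the whole string and returns the first match object,
# or None if the pattern does not occur anywhere.
first = re.search(phone, text)
print(first.group())   # 301-555-0100

# match() succeeds only if the pattern matches at the very start
# of the string; here the text starts with "Call", so it returns None.
at_start = re.match(phone, text)
print(at_start)        # None

# findall() returns a list of every non-overlapping match as strings.
every = re.findall(phone, text)
print(every)           # ['301-555-0100', '301-555-0199']
```

Because search() and match() can return None, code that calls .group() on the result generally needs to check for a match first.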
Readings
- Severance, Python for Everybody, Chapter 11: Regular Expressions
- re (Python module documentation)
- Python 3 Regex Howto
Links
- Slides
- Exercises
- Lab
- Regex Cheat Sheet (downloadable PDF)
- Regular Expressions 101: Online regex tester and debugger. This site is very useful for developing and debugging regex patterns, since it lets you test a pattern against sample text and see how it behaves in practice.
- Programming Historian: Understanding Regular Expressions, an introductory tutorial by Doug Knox
- “Regular expression”, an overview from Wikipedia: The Free Encyclopedia