class: center, middle # Regular Expressions --- ## What Are Regular Expressions? - A pattern-matching language for testing strings -- - Given a pattern (e.g. "contains two digits in a row"), does a given string contain something that fits the pattern? -- - Most languages provide regex tools that allow a programmer to do things with regexes and matched substrings -- - We'll use JavaScript, but the same broad concepts apply to other languages, and regex syntax is similar from one language to another --- class: center, middle # JS Regex Syntax --- ## Basics In JS, a regular expression can be created as a literal by encolosing the pattern in `/` characters ``` let myRegex = /PATTERN/; ``` If a string contains a substring that fits the pattern, we say it "matches" the substring (or, lazily, that it matches the string itself) -- **Example** `/Java/` matches `"JavaScript"` --- ## Character Sets Character sets use `[ ]` to define a collection of characters to match -- - `/[abc]/` matches `"apple"` but not `"Apple"` or `"dog"` -- - `/[abc][def]d/` matches `"add"` and `"bed"` but not `"dad"` -- Negate a character class with `^` -- - `/[^abc]/` matches `"apple"` and `"dog"` but not `"a"` --- ## Ranges We can define ranges within a character set -- - `[a-z]` is the set of all lowercase letters -- - `[0-9]` is the set of digits -- - `[a-zA-Z0-9]` is the set of all alphanumeric characters --- ## Character classes Some commonly-used character sets have **character classes**, which are shorthands for various sets of the form `\x`, where `x` is one of a handful of recognized letters. -- - `\d` is the set of all digits -- - `\w` is the set of all "word characters" (letters, numbers, and underscore) -- - `\s` is the set of all whitespace characters (space, tab, newline, etc) -- We can use the negated form of a character class by capitalizing its assigned letter. E.g. `\S` is the set of all *non*-whitespace characters. -- The meta-character `.` can be thought of as a character class or set that matches *any* character (e.g. `/a.b/`) --- ## Quantifiers Quantifiers allow us to specify that a portion of a regex can be repeated some number of times to constitute a match. -- - `*` - zero or more times - `/dog*/` matches `"dog"`, `"do"`, and `"doggo"` -- - `+` - one or more times - `/dog+/` matches `"dog"` and `"doggo"` but not `"do"` -- - `?` - zero or one times (i.e. "optional") - `/dog?/` matches `"dog"`, `"do"`, and `"door"` --- ## More quantifiers The quantifier `{n,m}` specifies that a portion must be repeated between `n` and `m` times. Both integers are optional. -- - `/dog{1,2}/` matches `"dog"`, `"doggo"`, and `"doggggo"` but not `"do"` -- - `/dog{2,}/` matches `"doggo"` and `"doggggo"` but not `"dog"` --- ## Boundaries There are two special characters that match the beginning (`^`) and end (`$`) of a string -- - `/^Code/` matches `"Code Club"` but not `"LaunchCode"` -- - `/Code$/` matches `"LaunchCode"` but not `"Code Club"` -- - `/^Code$/` matches only `"Code"` (a "full string" match) --- ## Groups We can group portions of a regex using parens, `( )`, in order to apply quantifiers to a subexpression, or to "capture" matches of the subexpression (more on this later) -- - `/(ab.*)+/` matches `"abracadabra"`, `"abs"`, and `"abababab"` --- ## Flags We can use *flags* to turn on optional behavior for a regex. A flag is a letter that goes after the second `/`. -- `i` - carry out a case-insensitive search `g` - carry out a global search (i.e. search for all occurences of a pattern) --- ## JavaScript Regex Methods ### Search / validate - `String.prototype.search()` / [docs](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/search) / Does the string look like we expect? - `String.prototype.match()` / [docs](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match) / Which parts of the string look like we expect? -- ### Replace / transform - `String.prototype.replace()` / [docs](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace) / Replace matches with a new substring --- ## Example - Phone Number Validation What are valid US phone number formats that you would expect a user to be able to enter into an app? We will create a pattern to match *all* of them! --- ## Materials... ...are on GitHub. I'm @chrisbay. - [chrisbay/regex-slides](https://github.com/chrisbay/regex-slides) - this deck - [chrisbay/regex-practice](https://github.com/chrisbay/regex-practice) - some regex exercises in JS using TDD