Strip Phone: Part 2

This is a follow-up to the amazingly popular Strip Phone: Part 1. Review that if you need more clarity here.

Instead of using a pre-defined character class this time, we will roll our own character class.

In this puzzle, we will try to extract a phone number from user input. Like most problems, there are many possible ways to approach this. There are even multiple ways to approach this with regular expressions. Consider the problem of non-normalized telephone numbers entered by users (to list just a few):

  • 650-555-5555
  • 612.555.5555
  • +1612.555.5555
  • (612) 555 5555

Instead of being clever and matching every possible format of a phone number (possible, and we'll work towards that), let's begin by doing the reverse: rather than identifying the digits, let's just remove anything that's not a digit.

One predefined character class that matches all digits is identified by: \d
The reverse of this character class, one that matches all non-digits, is identified with a capital 'D': \D (note the backslash escape character, which signals the beginning of a predefined character class)

The equivalent hand-rolled character class to \d is: [0-9]
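
To see the two forms side by side, here is a minimal sketch. The class name DigitClassDemo and the helper matchesWhole are made up for illustration; Pattern.matches is the standard java.util.regex call. It assumes Java's default regex mode, in which \d matches only the ASCII digits 0 through 9.

```java
import java.util.regex.Pattern;

public class DigitClassDemo {
    // Hypothetical helper: true if the ENTIRE input is matched by the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // In a Java string literal the backslash itself must be escaped,
        // so the regex \d is written as "\\d".
        System.out.println(matchesWhole("\\d", "7"));   // true
        System.out.println(matchesWhole("[0-9]", "7")); // true
        System.out.println(matchesWhole("\\d", "x"));   // false
        System.out.println(matchesWhole("[0-9]", "x")); // false
        System.out.println(matchesWhole("\\D", "x"));   // true
    }
}
```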

  • Character classes are contained within brackets: [ ]
  • Most characters inside of a character class will match literally, e.g., [a] matches a
  • If there is a dash between two characters in a character class, it indicates all characters, inclusive, within that range, e.g., [a-z] matches all lowercase letters (at least in ASCII)
  • If the dash is the first character in a character class, then it is interpreted literally as a dash, e.g., [-abc] matches any one of the characters '-', 'a', 'b', or 'c'
  • If the character class begins with a caret, '^', it negates the class: the class matches any single character that is NOT listed in it, e.g., [^abc] matches every character that is not 'a', 'b', or 'c' (that's a lot of characters, btw)
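
The rules above can be checked one at a time with a short sketch. The class name CharacterClassRules and the helper inClass are assumptions made for this example; String.matches is the standard method and succeeds only when the whole string matches.

```java
public class CharacterClassRules {
    // Hypothetical helper: true if the single-character string ch
    // is matched by the given character class.
    static boolean inClass(String charClass, String ch) {
        return ch.matches(charClass);
    }

    public static void main(String[] args) {
        System.out.println(inClass("[a]", "a"));     // true: literal match
        System.out.println(inClass("[a-z]", "q"));   // true: inside the range
        System.out.println(inClass("[a-z]", "Q"));   // false: uppercase is outside a-z
        System.out.println(inClass("[-abc]", "-"));  // true: leading dash is literal
        System.out.println(inClass("[^abc]", "d"));  // true: not a, b, or c
        System.out.println(inClass("[^abc]", "b"));  // false: b is excluded by the caret
    }
}
```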

Similarly, the equivalent hand-rolled character class to \D is: [^0-9]

Note the caret '^' character at the beginning of the character class.
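
As a rough sanity check that \D and [^0-9] really behave the same, the sketch below compares them on every ASCII character. The class and method names are hypothetical, and the check assumes Java's default regex mode (no UNICODE_CHARACTER_CLASS flag).

```java
public class NegatedDigitClass {
    // Hypothetical check: do \D and [^0-9] agree on all 128 ASCII characters?
    static boolean classesAgreeOnAscii() {
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (s.matches("\\D") != s.matches("[^0-9]")) {
                return false; // found a character the two classes disagree on
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(classesAgreeOnAscii()); // true
    }
}
```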

For this puzzle, we will use the String.replaceAll(String regex, String replacement) method. Instead of matching a pattern, we'll just remove all non-digit characters. For the regex parameter, pass in a regular expression that matches all non-digit characters. If you were going to remove a character from a String, what would you replace the character with? That is what you should pass as the 'replacement' parameter.
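
To show the mechanics of replaceAll without giving the puzzle away, here is a small sketch that uses a different character class and a visible replacement. The class name ReplaceAllDemo and the helper replaceVowels are invented for this example.

```java
public class ReplaceAllDemo {
    // Hypothetical helper: every character matched by the class [aeiou]
    // is swapped for the replacement string.
    static String replaceVowels(String input, String replacement) {
        return input.replaceAll("[aeiou]", replacement);
    }

    public static void main(String[] args) {
        // Each lowercase vowel is replaced; everything else is left alone.
        System.out.println(replaceVowels("regular expression", "*"));
        // prints: r*g*l*r *xpr*ss**n
    }
}
```

One thing to keep in mind: in the replacement string, the characters '$' and '\' have special meaning to replaceAll, so a plain replacement is safest while experimenting.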

For further details, the JavaDocs give a good explanation of how regular expressions are implemented in Java.

