regex - Regular expression to match if there's a non-alphabetical character at the end, or nothing?
I have a regular expression to match homonyms: tw?oo? matches either two, to, or too. (It also matches twoo, but that's OK.)
My question is: I want the regular expression to match if there is punctuation or another non-alphabetical character at the ends, as in "to," or "two." or ",too!". If there's nothing at the ends, that's OK as well.
So I want to match tw?oo? if there are no other characters on either side, or if there are non-alphabetical characters, but not if there are letters around it: tomorrow shouldn't match.
I tried [^a-zA-Z]?tw?oo?[^a-zA-Z]?, but since the character classes are optional, they can simply be omitted, so the pattern still matches inside tomorrow.
How can I write a regex that matches these words if they are on their own or surrounded by punctuation? (Spaces aren't a problem; they've been cut out.)
Thanks!
Use word boundaries, \b. They match wherever a word character (\w) and a non-word character are adjacent, or where a word character sits at the very start or end of the string:
    # Test each sample word against the word-boundary pattern.
    for (qw/to two tomorrow/) {
        print "$_ ", /\b(?:two|to|too)\b/ ? "matches" : "doesn't match", "\n";
    }

Output:

    to matches
    two matches
    tomorrow doesn't match
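To make the contrast with the attempt from the question concrete, here is a minimal sketch in Python's re module (a swap from the Perl above; the pattern syntax used is the same in both). The names HOMONYM, OPTIONAL_CLASSES, and samples are made up for illustration, and the sample strings are the ones from the question plus "into" as an extra assumed case.

```python
import re

# Pattern from the answer: whole-word alternation guarded by word boundaries.
HOMONYM = re.compile(r"\b(?:two|to|too)\b")

# The attempt from the question: both character classes are optional,
# so the engine is free to match them against nothing at all.
OPTIONAL_CLASSES = re.compile(r"[^a-zA-Z]?tw?oo?[^a-zA-Z]?")

samples = ["to", "to,", "two.", ",too!", "tomorrow", "into"]

for text in samples:
    bounded = bool(HOMONYM.search(text))
    optional = bool(OPTIONAL_CLASSES.search(text))
    print(f"{text!r:12} word boundary: {bounded!s:5}  optional classes: {optional}")
```

The word-boundary version should accept the first four samples and reject "tomorrow" and "into", while the optional-class version accepts all six. One caveat: \b treats digits and the underscore as word characters too, so a string like "to2" would not match; if only letters should block a match, lookarounds such as (?<![a-zA-Z]) and (?![a-zA-Z]) are an alternative.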
Edit: I changed the regex to /\b(?:two|to|too)\b/ per tobyink's suggestion. It is more readable than tw?oo?, more correct than tw?o+, and it also triggers the trie optimization, which transforms part of the regex into an efficient state machine.
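The "more correct" claim can be checked by listing which whole words each of the three patterns accepts. This is a rough sketch in Python; the candidate list and the names patterns, candidates, and accepted are assumptions for illustration. It only compares which strings match. The trie optimization is specific to Perl's engine and is not demonstrated here.

```python
import re

patterns = {
    "tw?oo?": re.compile(r"tw?oo?"),
    "tw?o+": re.compile(r"tw?o+"),
    "two|to|too": re.compile(r"two|to|too"),
}

candidates = ["to", "too", "two", "twoo", "tooo", "twooo"]

# For each pattern, collect the candidates it accepts as a complete word.
accepted = {
    name: [word for word in candidates if rx.fullmatch(word)]
    for name, rx in patterns.items()
}

for name, words in accepted.items():
    print(f"{name:12} accepts {words}")
```

Only the explicit alternation limits itself to to, too, and two. tw?oo? also lets twoo through, and tw?o+ accepts any number of trailing o's.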