perl reference post
. # Any single character except a newline
^ # The beginning of the line or string
$ # The end of the line or string
* # Zero or more of the last character
+ # One or more of the last character
? # Zero or one of the last character
t.e # t followed by anthing followed by e
# This will match the
# tre
# tle
# but not te
# tale
^f # f at the beginning of a line
^ftp # ftp at the beginning of a line
e$ # e at the end of a line
tle$ # tle at the end of a line
und* # un followed by zero or more d characters
# This will match un
# und
# undd
# unddd (etc)
.* # Any string without a newline. This is because
# the . matches anything except a newline and
# the * means zero or more of these.
^$ # A line with nothing in it.
There are even more options. Square brackets are used to match any one of the characters inside them. Inside square brackets a - indicates "between" and a ^ at the beginning means "not":
[qjk] # Either q or j or k
[^qjk] # Neither q nor j nor k
[a-z] # Anything from a to z inclusive
[^a-z] # No lower case letters
[a-zA-Z] # Any letter
[a-z]+ # Any non-zero sequence of lower case letters
A vertical bar | represents an "or" and parentheses (...) can be used to group things together:
jelly|cream # Either jelly or cream
(eg|le)gs # Either eggs or legs
(da)+ # Either da or dada or dadada or...
\n # A newline
\t # A tab
\w # Any alphanumeric (word) character.
# The same as [a-zA-Z0-9_]
\W # Any non-word character.
# The same as [^a-zA-Z0-9_]
\d # Any digit. The same as [0-9]
\D # Any non-digit. The same as [^0-9]
\s # Any whitespace character: space,
# tab, newline, etc
\S # Any non-whitespace character
\b # A word boundary, outside [] only
\B # No word boundary
\| # Vertical bar
\[ # An open square bracket
\) # A closing parenthesis
\* # An asterisk
\^ # A carat symbol
\/ # A slash
\\ # A backslash
[01] # Either "0" or "1"
\/0 # A division by zero: "/0"
\/ 0 # A division by zero with a space: "/ 0"
\/\s0 # A division by zero with a whitespace:
# "/ 0" where the space may be a tab etc.
\/ *0 # A division by zero with possibly some
# spaces: "/0" or "/ 0" or "/ 0" etc.
\/\s*0 # A division by zero with possibly some
# whitespace.
\/\s*0\.0* # As the previous one, but with decimal
# point and maybe some 0s after it. Accepts
# "/0." and "/0.0" and "/0.00" etc and
# "/ 0." and "/ 0.0" and "/ 0.00" etc.
$_ = "Lord Whopper of Fibbing";
s/([A-Z])/:\1:/g;
:L:ord :W:hopper of :F:ibbing
if (/(\b.+\b) \1/)
{print "Found $1 repeated\n";}
The following swaps the first and last characters of a line in the $_ variable:
s/^(.)(.*)(.)$/\3\2\1/
After a match, you can use the special read-only variables $` and $& and $' to find what was matched before, during and after the seach.
$_ = "Lord Whopper of Fibbing";
/pp/;
$` eq "Lord Wo";
$& eq "pp";
$' eq "er of Fibbing";
$search = "the";
s/${search}re/xxx/;
The tr function allows character-by-character translation.
$sentence =~ tr/abc/edf/
tr/a-z/A-Z/;
stolen from: http://www.comp.leeds.ac.uk/Perl/
^ # The beginning of the line or string
$ # The end of the line or string
* # Zero or more of the last character
+ # One or more of the last character
? # Zero or one of the last character
t.e # t followed by anthing followed by e
# This will match the
# tre
# tle
# but not te
# tale
^f # f at the beginning of a line
^ftp # ftp at the beginning of a line
e$ # e at the end of a line
tle$ # tle at the end of a line
und* # un followed by zero or more d characters
# This will match un
# und
# undd
# unddd (etc)
.* # Any string without a newline. This is because
# the . matches anything except a newline and
# the * means zero or more of these.
^$ # A line with nothing in it.
There are even more options. Square brackets are used to match any one of the characters inside them. Inside square brackets a - indicates "between" and a ^ at the beginning means "not":
[qjk] # Either q or j or k
[^qjk] # Neither q nor j nor k
[a-z] # Anything from a to z inclusive
[^a-z] # No lower case letters
[a-zA-Z] # Any letter
[a-z]+ # Any non-zero sequence of lower case letters
A vertical bar | represents an "or" and parentheses (...) can be used to group things together:
jelly|cream # Either jelly or cream
(eg|le)gs # Either eggs or legs
(da)+ # Either da or dada or dadada or...
\n # A newline
\t # A tab
\w # Any alphanumeric (word) character.
# The same as [a-zA-Z0-9_]
\W # Any non-word character.
# The same as [^a-zA-Z0-9_]
\d # Any digit. The same as [0-9]
\D # Any non-digit. The same as [^0-9]
\s # Any whitespace character: space,
# tab, newline, etc
\S # Any non-whitespace character
\b # A word boundary, outside [] only
\B # No word boundary
\| # Vertical bar
\[ # An open square bracket
\) # A closing parenthesis
\* # An asterisk
\^ # A carat symbol
\/ # A slash
\\ # A backslash
[01] # Either "0" or "1"
\/0 # A division by zero: "/0"
\/ 0 # A division by zero with a space: "/ 0"
\/\s0 # A division by zero with a whitespace:
# "/ 0" where the space may be a tab etc.
\/ *0 # A division by zero with possibly some
# spaces: "/0" or "/ 0" or "/ 0" etc.
\/\s*0 # A division by zero with possibly some
# whitespace.
\/\s*0\.0* # As the previous one, but with decimal
# point and maybe some 0s after it. Accepts
# "/0." and "/0.0" and "/0.00" etc and
# "/ 0." and "/ 0.0" and "/ 0.00" etc.
$_ = "Lord Whopper of Fibbing";
s/([A-Z])/:\1:/g;
:L:ord :W:hopper of :F:ibbing
if (/(\b.+\b) \1/)
{print "Found $1 repeated\n";}
The following swaps the first and last characters of a line in the $_ variable:
s/^(.)(.*)(.)$/\3\2\1/
After a match, you can use the special read-only variables $` and $& and $' to find what was matched before, during and after the seach.
$_ = "Lord Whopper of Fibbing";
/pp/;
$` eq "Lord Wo";
$& eq "pp";
$' eq "er of Fibbing";
$search = "the";
s/${search}re/xxx/;
The tr function allows character-by-character translation.
$sentence =~ tr/abc/edf/
tr/a-z/A-Z/;
stolen from: http://www.comp.leeds.ac.uk/Perl/
Labels: nerd