User:Merlin11188/Draft: Difference between revisions
Revision as of 20:23, 11 July 2011
Patterns
Classes
Character Class:
A character class is used to represent a set of characters. The following are character classes and their representations:
- x — Where x is any character other than the magic characters ^$()%.[]*+-?, x represents itself
- . — Represents all characters (#32kas321fslk#?@34)
- %a — Represents all letters (aBcDeFgHiJkLmNoPqRsTuVwXyZ)
- %c — Represents all control characters (all ascii characters below 32 and ascii character 127)
- %d — Represents all base-10 digits (0123456789)
- %l — Represents all lower-case letters (abcdefghijklmnopqrstuvwxyz)
- %p — Represents all punctuation characters (#^;,.) etc.
- %s — Represents all space characters (space, tab, newline, and other whitespace)
- %u — Represents all upper-case letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
- %w — Represents all alpha-numeric characters (aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789)
- %x — Represents all hexadecimal digits (0123456789ABCDEFabcdef)
- %z — Represents the character with representation 0 (the null terminator)
- %x — Represents (where x is any non-alphanumeric character) the character x. This is the standard way to escape the magic characters. Any punctuation character (even a non-magic one) can be preceded by a '%' when used to represent itself in a pattern. So, a literal percent sign in a pattern is written "%%"
Here's an example:
String="Ha! You'll never find any of these (323414123114452) numbers inside me!"
print(string.match(String, "%d")) -- Find a digit character
Output:
3
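To see several of the classes from the list side by side, here's a minimal sketch. The sample string and the variable name are made up for illustration; each call returns the first character in the string that belongs to the class.

```lua
-- Hypothetical sample string, invented for this illustration.
local sample = "Price: 42.50 (50% off)"

print(sample:match("%a"))  -- P  (first letter)
print(sample:match("%d"))  -- 4  (first digit)
print(sample:match("%p"))  -- :  (first punctuation character)
print(sample:match("%%"))  -- %  (a literal percent sign, escaped with %)
print(sample:match("%."))  -- .  (a literal period, escaped with %)
print(sample:match("."))   -- P  (unescaped period: any character at all)
```

Note the last two lines: the escaped "%." only matches a real period, while the bare "." happily matches the very first character.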
An upper-case version of any of these classes results in the complement of that class. For instance, %A will represent all non-letter characters. Here's another example:
Martian="141341432431413415072343E234141241312"
print(Martian:match("%D")) -- Find a non-digit character
Output:
E
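Complement classes are handy for throwing characters away. The sketch below uses string.gsub, which isn't covered in this section, to replace every non-digit with nothing. The phone number and variable names are made up for illustration.

```lua
-- Hypothetical input, invented for this illustration.
local phone = "(555) 123-4567"

-- %D is the complement of %d, so this removes everything that isn't a digit.
-- gsub returns the new string and the number of replacements it made.
local digitsOnly, removed = phone:gsub("%D", "")

print(digitsOnly)  -- 5551234567
print(removed)     -- 4

-- The other complements work the same way with match:
print(("42 apples"):match("%A"))  -- 4  (first non-letter)
print(("42 apples"):match("%S"))  -- 4  (first non-space)
```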
Modifiers
In Lua, a modifier follows a character class to make it repeated or optional. That's where they're useful: a single class can then match more than one character at a time:
- + — 1 or more repetitions
- * — 0 or more repetitions
- - — (minus sign) also 0 or more repetitions, but matches the shortest possible sequence
- ? — optional (0 or 1 occurrence)
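I'll start with the simplest one: the ?. This makes the character class optional: if a matching character is there, the pattern matches 1 of it, and if it isn't, the pattern matches an empty string instead of failing. That sounds complex, but it's actually really simple, so here's an example: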
stringToMatch="Once upon a time, in a land far, far away..."
print(stringToMatch:match("%a?")) -- Find a letter, but it doesn't have to be there.
print(stringToMatch:match("%d?")) -- Find a digit, but it doesn't have to be there.
Output:
O -- O, in Once.
-- A blank line is printed here: the digit didn't need to be there, so the match is an empty string, not nil.
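The ? becomes more useful when it sits in the middle of a longer pattern. Here's a small sketch, using a made-up pattern for two spellings of the same word. A plain letter counts as a character class too, so the ? applies to the "u" just before it.

```lua
-- Hypothetical pattern, invented for this illustration: the "u" is optional.
local pattern = "colou?r"

print(("color"):match(pattern))   -- color
print(("colour"):match(pattern))  -- colour
print(("colr"):match(pattern))    -- nil  (the second "o" is still required)

-- An optional class on its own never fails; it just matches nothing.
print(("abc"):match("%d?") == "")  -- true
print(("abc"):match("%d"))         -- nil
```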
The + symbol used after a character class requires at least one instance of that class, and matches the longest run of characters from that class. Here's an example:
stringToMatch="Once upon a time, in a land far, far away..."
print(stringToMatch:match("%a+")) -- Finds the first letter, then matches letters until a non-letter character
print(stringToMatch:match("%d+")) -- Finds the first digit, then matches digits until a non-digit character
Output:
Once
nil -- Nil, because the pattern required at least one digit, but there wasn't one, so the match failed.
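One common use of %a+ is pulling every word out of a string. The sketch below assumes string.gmatch, which isn't covered in this section: it returns an iterator that yields each successive match of the pattern. The table name is made up for illustration.

```lua
local sentence = "Once upon a time, in a land far, far away..."

-- Collect every run of letters; punctuation and spaces act as separators.
local words = {}
for word in sentence:gmatch("%a+") do
    words[#words + 1] = word
end

print(#words)                    -- 10
print(table.concat(words, "|"))  -- Once|upon|a|time|in|a|land|far|far|away
```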
The * symbol used after a character class is like a combination of the + and ? modifiers. It matches the longest sequence of the character class, but that sequence is allowed to be empty. Here's an example of it matching a floating-point (decimal) number, without requiring the decimal point:
numPattern="%d+%.?%d*"
--[[ Requires at least one digit; then, if there's a decimal point, get it (remember: a period is a magic character, so you have to escape it with the % sign); then, if there are digits after the decimal point, grab them. ]]
local num1="21608347 is an integer, a whole number, and a natural number!"
local num2="2034782.014873 is a decimal number!"
print(num1:match(numPattern))
print(num2:match(numPattern))
Output:
21608347 -- Grabbed a whole number, because there wasn't a decimal point or digits after it
2034782.014873 -- Grabbed the floating-point number, because it had a decimal point and digits after it
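To make the difference between * and + concrete, here's a short sketch. The strings and the signedPattern name are made up for illustration; the last pattern is just one possible way to extend the number pattern with an optional minus sign, not the only one.

```lua
local text = "abc123"

print(text:match("%d+"))        -- 123   (skips ahead to where digits start)
print(text:match("%d*") == "")  -- true  (zero digits at position 1 already counts)
print(text:match("%a+%d*"))     -- abc123
print(("abc"):match("%a+%d*"))  -- abc   (the digits are allowed to be missing)

-- Hypothetical extension: an optional, escaped minus sign in front.
local signedPattern = "%-?%d+%.?%d*"
print(("Temperature: -12.5 degrees"):match(signedPattern))  -- -12.5
```

The second line is the one that tends to surprise people: because * accepts zero repetitions, the match succeeds immediately at the start of the string and never reaches the digits.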
The - symbol used after a character class is like the * symbol, with only one difference: it matches the shortest possible sequence of the character class instead of the longest. Here's an example showing the difference:
String="((3+4)+3+4)+2"
print(String:match("%(.*%)")) -- Find a (, then match all characters (the . represents all characters) until the LAST ).
print(String:match("%(.-%)")) -- Find a (, then match all characters until the FIRST ).
Output:
((3+4)+3+4) -- Grabbed everything from the first parenthesis to the last closing parenthesis
((3+4) -- Grabbed everything from the first parenthesis to the first closing parenthesis
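The shortest-match behaviour is what you usually want when a string contains several delimited pieces. Here's a minimal sketch with a made-up markup string; it again assumes string.gmatch to walk through every match.

```lua
-- Hypothetical input, invented for this illustration.
local markup = "<b>bold</b> and <i>italic</i>"

print(markup:match("<.*>"))  -- <b>bold</b> and <i>italic</i>  (runs to the LAST >)
print(markup:match("<.->"))  -- <b>                            (stops at the FIRST >)

-- With the - modifier, each tag is picked up separately.
local tags = {}
for tag in markup:gmatch("<.->") do
    tags[#tags + 1] = tag
end
print(table.concat(tags, " "))  -- <b> </b> <i> </i>
```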