User:Merlin11188/Draft

From Legacy Roblox Wiki
Revision as of 20:23, 11 July 2011 by >Merlin11188 (→‎Modifiers)

Patterns

Patterns require some knowledge of string manipulation.


Classes

Character Class:

A character class is used to represent a set of characters. The following are character classes and their representations:

  • x — Where x is any character other than the magic characters (^$()%.[]*+-?), x represents itself
  • . — Represents all characters (#32kas321fslk#?@34)
  • %a — Represents all letters (aBcDeFgHiJkLmNoPqRsTuVwXyZ)
  • %c — Represents all control characters (all ascii characters below 32 and ascii character 127)
  • %d — Represents all base-10 digits (0123456789)
  • %l — Represents all lower-case letters (abcdefghijklmnopqrstuvwxyz)
  • %p — Represents all punctuation characters (such as #^;,.)
  • %s — Represents all space characters (space, tab, newline, and similar)
  • %u — Represents all upper-case letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
  • %w — Represents all alpha-numeric characters (aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789)
  • %x — Represents all hexadecimal digits (0123456789ABCDEFabcdef)
  • %z — Represents the character with representation 0 (the null terminator)
  • %x — Where x is any non-alphanumeric character, represents the character x itself. This is the standard way to escape the magic characters. Any punctuation character (even a non-magic one) can be preceded by a '%' when it is used to represent itself in a pattern. So, a literal percent sign in a pattern is written "%%"

Here's an example:

Example
String="Ha! You'll never find any of these (323414123114452) numbers inside me!"
print(string.match(String, "%d")) -- Find a digit character

Output:
3
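
The escaping rule at the end of the class list can be sketched the same way. The sample string below is made up purely for illustration; each pattern puts a % in front of a magic character so that it matches literally:

```lua
-- Hypothetical sample string, chosen only for illustration.
local notice = "Sale ends soon. 50% off (today only)"

local percent = notice:match("%d+%%")  -- digits followed by a literal percent sign
local paren = notice:match("%(%a+")    -- a literal "(" followed by letters
local period = notice:match("%a+%.")   -- letters followed by a literal period

print(percent) --> 50%
print(paren)   --> (today
print(period)  --> soon.
```

Without the %, the ( would start a capture and the . would match any character, so escaping is what makes them stand for themselves.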


An upper-case version of any of these classes results in the complement of that class. For instance, %A will represent all non-letter characters. Here's another example:

Example
Martian="141341432431413415072343E234141241312"
print(Martian:match("%D")) -- Find a non-digit character

Output:
E
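
The same complement rule applies to the other classes. Here is a small sketch, using a made-up string, that shows three of the upper-case classes side by side:

```lua
-- Hypothetical sample string, chosen only for illustration.
local id = "level7!"

local nonLetter = id:match("%A")  -- first character that is not a letter
local nonAlnum = id:match("%W")   -- first character that is not alpha-numeric
local nonDigit = id:match("%D")   -- first character that is not a digit

print(nonLetter) --> 7
print(nonAlnum)  --> !
print(nonDigit)  --> l
```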

Modifiers

In Lua patterns, modifiers are placed after a character class to make it repeat or to make it optional. That's where they're useful: with them, a single class can match more than one character at a time:

  • + — 1 or more repetitions
  • * — 0 or more repetitions
  • - — (minus sign) also 0 or more repetitions, but matches the shortest possible sequence
  • ? — optional (0 or 1 occurrence)


I'll start with the simplest one: the ?. This makes the character class optional: if a character of that class is there, it matches 1 of it, and if not, it matches nothing (an empty string). That sounds complex, but is actually really simple, so here's an example:

Example
stringToMatch="Once upon a time, in a land far, far away..."
print(stringToMatch:match("%a?")) -- Find a letter, but it doesn't have to be there.
print(stringToMatch:match("%d?")) -- Find a digit, but it doesn't have to be there.

Output:
O -- O, in Once.
-- An empty line: the digit didn't need to be there, so the match succeeded with zero characters and returned an empty string (not nil).
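
A more practical use of ? is an optional sign in front of a digit. This is only a sketch with made-up strings; the minus sign is escaped with % because it is a magic character:

```lua
-- Pattern: an optional literal minus sign, then exactly one digit.
local signedDigit = "%-?%d"

local negative = ("x = -7"):match(signedDigit)
local positive = ("x = 7"):match(signedDigit)
local emptyMatch = ("abc"):match("%d?")  -- no digit at the start, so zero characters match

print(negative)         --> -7
print(positive)         --> 7
print(emptyMatch == "") --> true
```

The last line shows that a pattern made only of optional items still succeeds when nothing is there: the result is an empty string rather than nil.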


The + symbol used after a character class requires at least one instance of that class, and will get the longest string of that class. Here's an example:

Example
stringToMatch="Once upon a time, in a land far, far away..."
print(stringToMatch:match("%a+")) -- Finds the first letter, then matches letters until a non-letter character
print(stringToMatch:match("%d+")) -- Finds the first digit, then matches digits until a non-digit character

Output:
Once
nil -- The pattern required at least one digit, but there wasn't one, so match returns nil.
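
Classes with + can also be chained. The following sketch uses a made-up string to show each class on its own and then both together:

```lua
-- Hypothetical sample string, chosen only for illustration.
local entry = "Player42 scored 1500 points"

local word = entry:match("%a+")       -- first run of letters
local number = entry:match("%d+")     -- first run of digits
local name = entry:match("%a+%d+")    -- a run of letters immediately followed by a run of digits

print(word)   --> Player
print(number) --> 42
print(name)   --> Player42
```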


The * symbol used after a character class is like a combination of the + and ? modifiers. It matches the longest sequence of the character class, but it doesn't have to be there. Here's an example of it matching a floating-point (decimal) number, without requiring the decimal:

Example
numPattern="%d+%.?%d*"
--[[ Requires one or more digits; then, if there's a decimal point, get it (remember: a period is a magic character, so you have to escape it with the % sign); then, if there are digits after the decimal point, grab them. ]]

local num1="21608347 is an integer, a whole number, and a natural number!"
local num2="2034782.014873 is a decimal number!"
print(num1:match(numPattern))
print(num2:match(numPattern))

Output:
21608347 -- Grabbed a whole number, because there wasn't a decimal point or numbers after the decimal point
2034782.014873 -- Grabbed the floating-point number, because it had a decimal and numbers after it
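
One thing to watch for: because * accepts zero repetitions, a pattern made only of a starred class matches an empty string at the very start of the text. This sketch, with a made-up string, contrasts it with +:

```lua
-- Hypothetical sample string, chosen only for illustration.
local text = "abc123"

local plus = text:match("%d+")      -- skips ahead to the first run of digits
local star = text:match("%d*")      -- succeeds at position 1 with zero digits
local anchored = text:match("c%d*") -- a literal "c", then as many digits as follow

print(plus)       --> 123
print(star == "") --> true
print(anchored)   --> c123
```

This is why numPattern above begins with %d+ rather than %d*: something in the pattern has to require at least one character.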


The - symbol used after a character class is like the * symbol; there's only one difference, actually: It matches the shortest sequence of the character class that still lets the rest of the pattern match. Here's an example showing the difference:

Example
String="((3+4)+3+4)+2"
print(String:match("%(.*%)")) -- Find a (, then match all characters (the . represents all characters) until the LAST ).
print(String:match("%(.-%)")) -- Find a (, then match all characters until the FIRST ).

Output:
((3+4)+3+4) -- Grabbed everything from the first parenthesis to the last closing parenthesis
((3+4) -- Grabbed everything from the first parenthesis to the first closing parenthesis
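
The same difference shows up whenever a delimiter appears more than once. As a sketch, here is a made-up string with angle-bracket tags:

```lua
-- Hypothetical sample string, chosen only for illustration.
local markup = "<b>bold</b> and <i>italic</i>"

local greedy = markup:match("<.*>")  -- from the first < to the LAST >
local lazy = markup:match("<.->")    -- from the first < to the FIRST >

print(greedy) --> <b>bold</b> and <i>italic</i>
print(lazy)   --> <b>
```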