|
|
| ==What are String Patterns?== | | =Vector3int16= |
| String patterns are, in essence, just [[String|strings]]. What makes them different from ordinary strings, then? String patterns are strings that use special combinations of characters. These character combinations are generally used with functions in the string library, such as 'string.match' and 'string.gsub', to do interesting things with strings. For instance, with string patterns you can do something like this:
| | {{Map|Scripting|Data Types}} |
| | __TOC__ <!-- TOC should be below the Map template. --> |
|
| |
|
| <pre>
| | {{type|Vector3int16}} is a variant of the {{type|Vector3}} datatype. As its name implies, its coordinates are stored as 16-bit signed integers. This means that each coordinate of a {{type|Vector3int16}} must be in the range of -32768 to 32767. Additionally, the {{type|Vector3int16}} datatype is stripped down in terms of functionality. Currently, it is exclusively used for creating a {{type|Region3int16}}, which in turn is used by some of the {{type|instance=Terrain|terrain}} object's methods. |
| local s = "I am a string!"
| |
| for i in string.gmatch(s, "[^%s]+") do --Where "[^%s]+" is the string pattern.
| |
| print(i)
| |
| end
| |
|
| |
|
| Output:
| | ==[[Constructors]]== |
| I
| | {| class="wikitable" |
| am
| | ! Constructor !! Description |
| a | | |- |
| string!
| | | Vector3int16.new(<var>x</var>, <var>y</var>, <var>z</var>) || Creates a new {{type|Vector3int16}} using coordinates <var>x</var>, <var>y</var>, <var>z</var>. |
| </pre> | | |} |
|
| |
|
| But what makes the code above so cool? Perhaps you've wanted to make a list of people without using a [[Tables|table]], or maybe you need to [[Text_Parsing_Tutorial|parse]] a string. String patterns can help do this!
| | == Methods == |
| | Unlike the {{type|Vector3}} datatype, {{type|Vector3int16}} does not have any known methods. |
|
| |
|
| | == Properties == |
| | All of these properties are read-only: an assignment such as Vector3int16.x = 5 does not work. You can, however, create a new vector with the changed value, or apply an operation, as described in the next section. |
|
| |
|
| ==The Basics of String Patterns== | | {| class="wikitable" |
| As said before, string patterns are strings that look a little different and are used for a different purpose than ordinary strings. Here we will look at the basic parts that make up a string pattern and what each of them means.
| | ! Property !! Type !! Description |
| | |- |
| | | Vector3int16.'''x''' || {{type|number}} || The x-coordinate |
| | |- |
| | | Vector3int16.'''y''' || {{type|number}} || The y-coordinate |
| | |- |
| | | Vector3int16.'''z''' || {{type|number}} || The z-coordinate |
| | |} |
|
| |
|
| ===Character Classes=== | | == Operators == |
| Character classes in string patterns stand for a range or set of characters. Let's look at the classes listed below.
| | Unlike {{type|Vector3}}, you can only operate on a {{type|Vector3int16}} with another {{type|Vector3int16}}. |
|
| |
|
| *%a | | {| class="wikitable" |
| :*This character class represents all letters no matter if they're lowercase or uppercase.
| | ! Operator !! Description |
| :*Some examples are: 'a', 'd', 'F', and 'G'.
| | |- |
| | | {{type|Vector3int16}} + {{type|Vector3int16}} || returns Vector3int16 translated (slid) by Vector3int16 |
| | |- |
| | | {{type|Vector3int16}} - {{type|Vector3int16}} || returns Vector3int16 translated (slid) by -Vector3int16 (also gives the position of one relative to the other) |
| | |- |
| | | {{type|Vector3int16}} * {{type|Vector3int16}} || returns Vector3int16 with each component multiplied by corresponding component |
| | |- |
| | | {{type|Vector3int16}} / {{type|Vector3int16}} || returns Vector3int16 with each component divided by corresponding component |
| | |} |
|
| |
|
| *%l
| | == See Also == |
| :*This character class represents all lowercase letters.
| | * [[Vector3]] |
| :*Some examples are: 'a', 'd', 'f', and 'g'.
| | * [[Region3]] |
| | |
| *%u
| |
| :*This character class represents all uppercase letters.
| |
| :*Some examples are: 'A', 'B', 'D', and 'Z'.
| |
| | |
| *%p
| |
| :*This character class represents all punctuation characters.
| |
| :*Some examples are: ".", "?", "+", and "/".
| |
| | |
| *%w
| |
| :*This character class represents all alphanumeric characters.
| |
| ::*This means that this class encompasses both letters and digits.
| |
| :*Some examples are 'A', 'f', '3', and '7'.
| |
| | |
| *%d
| |
| :*This character class represents all decimal (base 10) digits.
| |
| :*Examples are '0', '1', '2', and so on up to '9'.
| |
| | |
| *%s
| |
| :*This character class represents all space characters.
| |
| :*Some examples are ' ', '\n', and '\r'
| |
| | |
| *%c
| |
| :*This character class represents all control characters.
| |
| :*Control characters are characters with an ASCII code below 32 and also ASCII code 127
| |
| :*Control characters are all non-printing, meaning that they do not have a visible symbol.
| |
| | |
| *%x
| |
| :*This character class represents all hexadecimal (base 16) digits.
| |
| :*Some examples are '0', '7', 'a', and 'F'.
| |
| | |
| *%z
| |
| :*This character class represents the character '\0'.
| |
| :*This character is commonly referred to as NUL.
| |
| | |
| *The dot character class
| |
| :*This class is represented by a single dot '.'
| |
| :*This class represents all characters, every single one.
| |
| :*Unlike the others, it is not preceded by a '%' sign.
| |
| | |
| | |
| As you can see, each of the character classes is used to represent a set of characters. Now let's look at some of the many things we can do with just these character classes.
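As a minimal sketch of the classes listed above, the snippet below tests single characters against several classes with string.match. The helper name <code>classify</code>, the <code>classes</code> table, and the sample characters are illustrative assumptions and are not part of the string library.

```lua
-- Hypothetical lookup list: each entry pairs a character class with a readable name.
local classes = {
	{ "%a", "letter" },
	{ "%d", "digit" },
	{ "%p", "punctuation" },
	{ "%s", "space" },
	{ "%x", "hexadecimal digit" },
}

-- Hypothetical helper: returns the names of every class that a single character belongs to.
local function classify(ch)
	local found = {}
	for _, entry in ipairs(classes) do
		-- string.match returns nil when the character is not in the class.
		if string.match(ch, entry[1]) then
			table.insert(found, entry[2])
		end
	end
	return table.concat(found, ", ")
end

print(classify("F")) -- letter, hexadecimal digit
print(classify("7")) -- digit, hexadecimal digit
print(classify("?")) -- punctuation
```

Note that one character can belong to several classes at once, which is why 'F' is reported as both a letter and a hexadecimal digit.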
| |
| | |
| | |
| Classes can also be combined to represent a sequence of characters. For instance, %d%l would match a digit that is followed by a lowercase letter. Look at the following example:
| |
| <pre>
| |
| local s = "abc123"
| |
| local Pattern = "%a%a%a%d" --Matches three letters and a digit
| |
| print( string.match( s, Pattern ) )
| |
| | |
| Output:
| |
| abc1
| |
| </pre>
| |
| | |
| | |
| One thing you might notice about the character classes mentioned above is that they are all lowercase. Writing the letter in uppercase inverts the class. For instance, %s represents space characters, but %S represents everything except space characters. Likewise, %l represents lowercase letters, while %L represents its complement: all characters except lowercase letters. Let's look at this example:
| |
| <pre>
| |
| local s1 = "a4-2" --Letter, not a letter, punctuation, not a letter
| |
| local s2 = "aA-2" --Letter, letter, punctuation, not a letter
| |
| local Pattern = "%a%A%p%A" --Matches a letter, not a letter, punctuation, and not a letter
| |
| print( string.match( s1, Pattern ) )
| |
| print( string.match( s2, Pattern ) )
| |
| | |
| Output:
| |
| a4-2
| |
| nil
| |
| </pre>
| |
| Why did it print a4-2 and then nil? Because s1 matched the pattern, while no part of s2 did.
| |
| | |
| ===Pattern Items===
| |
| Pattern items can be used to make your code simpler. Here are the pattern items and their definitions; each one is explained below.
| |
| | |
| :*a single character class, which matches a single character in the string
| |
| | |
| :*a single character class followed by a '+', which matches 1 or more repetitions in the string. These repetition items will always match the longest possible sequence.
| |
| | |
| :*a single character class followed by a '*' (asterisk), which matches 0 or more repetitions in the string. These repetition items will always match the longest possible sequence.
| |
| | |
| :*a single character class followed by a '-', which matches 0 or more repetitions in the string. These repetitions will always match the shortest possible sequence.
| |
| | |
| :*a single character class followed by a '?', which matches 0 or 1 occurrence of the class in the string.
| |
| | |
| | |
| Now let's look at how to use them. In these examples, we will use the [[Function_Dump/String_Manipulation#string.match_.28s.2C_pattern_.5B.2C_init.5D.29|string.match]] function. Let's say you have code like this:
| |
| <pre>
| |
| s = "1234567"
| |
| </pre>
| |
| | |
| Instead of using "%d%d%d%d%d%d%d", you can use pattern items. This is especially useful when you don't know exactly how long whatever you're retrieving is. Let's look at this code:
| |
| | |
| <pre>
| |
| local s = "1234567"
| |
| local Pattern = "%d+" --See how I used the '+' pattern item to make it shorter?
| |
| print( string.match( s, Pattern ) )
| |
| | |
| Output:
| |
| 1234567
| |
| </pre>
| |
| | |
| | |
| Now let's take a look at the next pattern item (*) which matches 0 or more repetitions and the longest sequence.
| |
| <pre>
| |
| local s1 = "1,!643"
| |
| local s2 = "12349"
| |
| local Pattern = "%d%p*%d"
| |
| print( string.match( s1, Pattern ) )
| |
| print( string.match( s2, Pattern ) )
| |
| | |
| Output:
| |
| 1,!6
| |
| 12
| |
| </pre>
| |
| As you can see, it matches a digit, any punctuation characters that follow (if there are any), and then another digit.
| |
| | |
| | |
| Now let's look at the third pattern item '-' with the example below:
| |
| <pre>
| |
| local s = "5ab2__0"
| |
| local Pattern1 = "%d.-%d"
| |
| local Pattern2 = "%d.*%d"
| |
| print( string.match( s, Pattern1 ) )
| |
| print( string.match( s, Pattern2 ) )
| |
| | |
| Output:
| |
| 5ab2
| |
| 5ab2__0
| |
| </pre>
| |
| As you can see, this pattern item does the same as the '*' pattern item except that it looks for the shortest possible sequence.
| |
| | |
| | |
| The '?' pattern item differs from the others because it matches at most one occurrence: either 0 or 1 occurrence of the class.
| |
| <pre>
| |
| local s1 = "1.56"
| |
| local s2 = "7890"
| |
| local Pattern = "%d%p?%d"
| |
| print( string.match( s1, Pattern ) )
| |
| print( string.match( s2, Pattern ) )
| |
| | |
| Output:
| |
| 1.5
| |
| 78
| |
| </pre>
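A single literal character also counts as a character class, so '?' can follow an ordinary letter. As a small sketch (the example words are assumptions chosen for illustration), the pattern below accepts both spellings of a word:

```lua
local Pattern = "colou?r" -- the 'u' is optional

print( string.match( "color", Pattern ) )   -- color
print( string.match( "colour", Pattern ) )  -- colour
print( string.match( "colouur", Pattern ) ) -- nil, because '?' allows at most one 'u'
```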
| |
| | |
| ===Sets===
| |
| Sets are used when a single character class cannot do the whole job. For instance, you might want to match '''both''' lowercase letters (%l) as well as punctuation characters (%p) using a single class. So how would we do this? Let's take a look at this example:
| |
| | |
| <pre>
| |
| local s = "123 Hello! I am another string."
| |
| local Pattern = "[%l%p]+"
| |
| print(string.match(s, Pattern))
| |
| | |
| Output:
| |
| ello!
| |
| </pre>
| |
| | |
| As you can see from the example, sets are defined by the '[' and ']' around them. You can also see that the classes for lowercase letters and punctuation are contained within. This means that the set acts as a single class that represents both lowercase letters and punctuation, unlike %l%p, which would match the sequence of a lowercase letter followed by a punctuation character.
| |
| | |
| | |
| You aren't restricted to using only character classes, though! You can also use normal characters to add to the set. Also, you can specify a '''range''' of characters with the '-' symbol. Let's see how this works in the following example:
| |
| | |
| <pre>
| |
| local NormCharP = "[3_%l]+" --A set representing lowercase letters, a three, and an underscore that matches 1 or more repetitions.
| |
| local RangeP = "[1-4%u]+" --A set representing the digits 1 to 4 as well as uppercase letters that matches 1 or more repetitions.
| |
| local s1 = "Random_123"
| |
| local s2 = "37913 Sandwiches!"
| |
| | |
| for i in string.gmatch(s1, NormCharP) do
| |
| print(i)
| |
| end
| |
| print("--Next--")
| |
| for i in string.gmatch(s2, RangeP) do
| |
| print(i)
| |
| end
| |
| | |
| Output:
| |
| andom_
| |
| 3
| |
| --Next--
| |
| 3
| |
| 13
| |
| S
| |
| </pre>
| |
| | |
| | |
| From the example, you can see how string.gmatch picked pieces out of the strings s1 and s2 using the string patterns. There is still one last thing you can do with sets, and it explains how the example in the introduction works. Let's take a look at this code:
| |
| <pre>
| |
| local Pattern = "[^%s1-9]+"
| |
| local s = "He29ll0, I like strings1"
| |
| local temp = ""
| |
| for i in string.gmatch(s, Pattern) do
| |
| temp = temp .. i
| |
| end
| |
| print(temp)
| |
| | |
| Output:
| |
| Hell0,Ilikestrings
| |
| </pre>
| |
| This set is the complement of [%s1-9], meaning that it represents all characters '''except''' the space characters and the digits 1 to 9. A complement is defined by placing the '^' character at the beginning of the set, which makes the set match exactly the characters a normal set would not. As you can see from this example, the spaces, the digits 29 in the middle of 'Hello', and the trailing 1 were removed.
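The same result can be reached from the other direction. As a sketch, string.gsub (mentioned in the introduction) can delete everything that matches the ordinary set [%s1-9], instead of collecting everything that matches its complement. The variable names <code>cleaned</code> and <code>count</code> are illustrative.

```lua
local s = "He29ll0, I like strings1"

-- "[%s1-9]+" matches runs of space characters and the digits 1 to 9.
-- Each run is replaced with the empty string, which removes it.
local cleaned, count = string.gsub(s, "[%s1-9]+", "")

print(cleaned) -- Hell0,Ilikestrings
print(count)   -- 5 (the number of runs that were replaced)
```

Which form to use is a matter of taste: the complemented set describes what to keep, while the gsub form describes what to throw away.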
| |
| | |
| ===Captures===
| |
| Captures are used to get the pieces of a string that match part of a pattern. Captures are defined by putting parentheses around that part of the pattern. For instance, (%a%s) is a capture for a letter followed by a space. When a pattern matches, the text matched by each capture is stored and returned for future use. Let's look at this example:
| |
| <pre>
| |
| local s = "TwentyOne = 21"
| |
| local Pattern = "(%a+)%s=%s(%d+)"
| |
| local _, _, key, val = string.find( s, Pattern ) --see how I used parentheses to designate my captures? "key" is the first capture, and "val" is the second capture.
| |
| | |
| print( key, val )
| |
| | |
| Output:
| |
| TwentyOne 21 --See how it only printed the captures designated by the parentheses?
| |
| </pre>
| |
| | |
| | |
| Now what happens if you want to get a list by using captures? You can use string.gmatch to do this.
| |
| <pre>
| |
| local s = "TwentyOne = 21 Two=2 One =7 Four= 4"
| |
| local Pattern = "(%a+)%s?=%s?(%d+)"
| |
| for key, val in string.gmatch(s, Pattern) do
| |
| print( key, val )
| |
| end
| |
| | |
| Output:
| |
| TwentyOne 21
| |
| Two 2
| |
| One 7
| |
| Four 4
| |
| </pre>
| |
| As you can see, string.gmatch iterated through all the matches and returned the captures key and val.
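As a follow-up sketch, the captures from the loop above can be stored in a [[Tables|table]] for later lookup. The table name <code>values</code> is a hypothetical choice. Captures are always returned as strings, so tonumber is used to convert the digits.

```lua
local s = "TwentyOne = 21 Two=2 One =7 Four= 4"
local Pattern = "(%a+)%s?=%s?(%d+)"

local values = {} -- hypothetical lookup table built from the captures
for key, val in string.gmatch(s, Pattern) do
	-- val is a string such as "21"; convert it so it can be used in arithmetic.
	values[key] = tonumber(val)
end

print(values.TwentyOne + values.Two) -- 23
print(type(values.Four))             -- number
```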
| |
| | |
| | |
| ==See also==
| |
| *[[Function_Dump/String_Manipulation|String Manipulation]] | |