PikaScript Library Reference: strings

bake

Syntax

'concrete' = bake('abstract', ['escape' = "{"], ['return' = "}"])

Description

Processes the 'abstract' string by interpreting any text bracketed by 'escape' and 'return' as PikaScript expressions and injecting the results from evaluating those expressions. The default brackets are '{' and '}'. The code is evaluated in the caller's frame. Thus you can inject local variables like this: '{myvar}'.

Examples

bake('The result of 3+7 is {3+7}') === 'The result of 3+7 is 10'
bake('Welcome back {username}. It has been {days} days since your last visit.')
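
Because the expressions are evaluated in the caller's frame, a local variable only needs to exist before bake() is called. The following sketch uses made-up values for username and days, and it assumes that single-character custom brackets such as '[' and ']' behave exactly like the defaults:

```pikascript
// Hypothetical values, chosen only for illustration.
username = 'Barbapapa';
days = 3;

// Both expressions are looked up in the calling frame.
greeting = bake('Welcome back {username}. It has been {days} days since your last visit.');
// greeting should now be 'Welcome back Barbapapa. It has been 3 days since your last visit.'

// Custom brackets (assumed usage of the optional 'escape' and 'return' arguments).
sum = bake('The result of 3+7 is [3+7]', '[', ']');
// sum should now be 'The result of 3+7 is 10'
```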

<char>

Syntax

'character' = char(+code)

Description

Returns the character represented by +code as a string. +code is either an ASCII or Unicode value (depending on how PikaScript is configured). If +code is not a valid character code the exception 'Illegal character code: {code}' will be thrown.

Inverse: ordinal('character').

Examples

char(65) === 'A'
char(ordinal('å')) === 'å'

chop

Syntax

'chopped' = chop('string', +count)

Description

Removes the last +count number of characters from 'string'. This function is equivalent to 'string'{:length('string') - +count}. If +count is zero or negative, the entire 'string' is returned. If +count is greater than the length of 'string', the empty string is returned. (There is no function for removing characters from the beginning of the string because you can easily use 'string'{+count:}.)

Examples

chop('abcdefgh', 3) === 'abcde'
chop('abcdefgh', 42) === ''

<escape>

Syntax

'escaped' = escape('raw')

Description

Encodes the source string 'raw' in either single (') or double (") quotes, depending on its contents. If the string contains only printable ASCII characters (ASCII values between 32 and 126 inclusive) and no apostrophes ('), it is enclosed in single quotes with no further processing. Otherwise it is enclosed in double quotes (") and any unprintable ASCII character, backslash (\) or quotation mark (") is encoded using C-style escape sequences (e.g. "line1\nline2").

You can use unescape() to decode an escaped string.

Examples

escape("trivial") === "'trivial'"
escape("it's got an apostrophe") === '"it''s got an apostrophe"'
escape(unescape('"first line\n\xe2\x00tail"')) === '"first line\n\xe2\x00tail"'

<find>

Syntax

+offset = find('string', 'chars')

Description

Finds the first occurrence of any character of 'chars' in 'string' and returns the zero-based offset (i.e. 0 = first character). The search is case-sensitive. If no characters in 'chars' exist in 'string', the length of 'string' is returned. Use rfind() to find the last occurrence instead of the first. Use span() to find the first occurrence of any character not present in 'chars'. Use search() to find sub-strings instead of single characters.

Examples

find('abcd', 'd') == 3
find('abcdcba', 'dc') == 2
find('nomatch', 'x') == 7
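
Because find() returns the length of 'string' when nothing matches, the result can be used directly as a split point. A minimal sketch, assuming a hypothetical 'key=value' input line and the substring syntax 'string'{:+count} and 'string'{+count:}:

```pikascript
line = 'name=Barbapapa';      // hypothetical input
i = find(line, '=');          // 4, the offset of the first '='
key = line{:i};               // 'name'
value = line{i + 1:};         // 'Barbapapa'

// With no '=' present, find() returns the length, so the key is the whole line.
bare = 'flag';
bareKey = bare{:find(bare, '=')};   // 'flag'
```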

<lower>

Syntax

'lowercase' = lower('string')

Description

Translates 'string' character by character to lower case. Note that the standard implementation only handles characters with ASCII values between 32 and 126 inclusive.

Examples

lower('aBcD') === 'abcd'

<mismatch>

Syntax

+offset = mismatch('first', 'second')

Description

Compares the 'first' and 'second' strings character by character and returns the zero-based offset of the first mismatch (i.e. 0 = first character). If no mismatch is found before the end of the shorter string (for example, when the strings are identical), the returned value is the length of the shorter string. As usual, the comparison is case-sensitive.

Examples

mismatch('abcd', 'abcd') == 4
mismatch('abc', 'abcd') == 3
mismatch('abCd', 'abcd') == 2
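
Since the returned offset is also the number of matching leading characters, mismatch() can be used to extract a common prefix. A small sketch with made-up input strings:

```pikascript
a = 'interstellar';
b = 'internet';
n = mismatch(a, b);           // 5, the strings differ at 's' versus 'n'
prefix = a{:n};               // 'inter'
```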

<ordinal>

Syntax

+code = ordinal('character')

Description

Returns the ordinal (i.e. the character code) of the single character string 'character'. Depending on how PikaScript is configured, the character code is an ASCII or Unicode value. If 'character' cannot be converted to a character code the exception 'Value is not single character: {character}' will be thrown.

Inverse: char(+code).

Examples

ordinal('A') == 65
ordinal(char(211)) == 211
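
Since ordinal() and char() are inverses, simple character arithmetic is possible. The sketch below relies only on letters and digits having consecutive codes, which holds for both ASCII and Unicode; the variable names are illustrative:

```pikascript
succ = char(ordinal('a') + 1);            // 'b'
digit = ordinal('7') - ordinal('0');      // 7
```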

<precision>

Syntax

'string' = precision(+value, +precision)

Description

Converts +value to a decimal number string (in scientific E notation if required). +precision is the maximum number of digits to include in the output. Scientific E notation (e.g. 1.3e+3) will be used if +precision is smaller than the number of digits required to express +value in decimal notation. The maximum number of characters returned is +precision plus 7 (for possible minus sign, decimal point and exponent).

Examples

precision(12345, 3) === '1.23e+4'
precision(9876, 8) === '9876'
precision(9876.54321, 8) === '9876.5432'
precision(-0.000000123456, 5) === '-1.2346e-7'
precision(+infinity, 1) === '+infinity'

<radix>

Syntax

'string' = radix(+value, +radix, [+minLength])

Description

Converts the integer +value to a string using a selectable radix between 2 (binary) and 16 (hexadecimal). If +minLength is specified and the string becomes shorter than this, it will be padded with leading zeroes. May throw 'Radix out of range: {radix}' or 'Minimum length out of range: {minLength}'.

Examples

radix(0xaa, 2, 12) === '000010101010'
radix(3735928559, 16) === 'deadbeef'
radix(0x2710, 10) === '10000'
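
A typical use of +minLength is fixed-width output. The following sketch builds a hexadecimal colour string from hypothetical component values; the variable names are illustrative only:

```pikascript
// Hypothetical colour components in the range 0 to 255.
r = 255; g = 128; b = 0;
colour = '#' # radix(r, 16, 2) # radix(g, 16, 2) # radix(b, 16, 2);
// colour should now be '#ff8000'

bits = radix(5, 2, 8);        // '00000101'
```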

replace

Syntax

'processed' = replace('source', 'what', 'with', [>findFunction = search], [+dropCount = length(what)], [>replaceFunction = >$1])

Description

Replaces all occurrences of 'what' with 'with' in the 'source' string.

The optional >findFunction allows you to modify how the function finds occurrences of 'what' and +dropCount determines how many characters are replaced on each occurrence. The default >findFunction is ::search (and +dropCount is the number of characters in 'what'), which means that 'what' represents a substring to substitute. If you want this function to substitute any occurrence of any character in 'what', you can let >findFunction be ::find and +dropCount be 1. Similarly, you may use ::span to substitute occurrences of all characters not present in 'what'.

Finally, >replaceFunction lets you customize how substrings should be replaced. It will be called with two arguments, the source substring in $0 and 'with' in $1, and it is expected to return the replacement substring.

Examples

replace('Barbazoo', 'zoo', 'bright') === 'Barbabright'
replace('Barbalama', 'lm', 'p', find, 1) === 'Barbapapa'
replace('Bqaxrbzzabypeillme', 'Bbarel', '', span, 1) === 'Barbabelle'
replace('B03102020', '0123', 'abmr', find, 1, >$1{$0}) === 'Barbamama'
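
The last example above is easier to follow when the call to >replaceFunction is spelled out: find() locates each digit, the digit arrives in $0, 'abmr' arrives in $1, and $1{$0} picks the character at that index. The sketch below shows two further variations derived from the description; the input strings are made up:

```pikascript
// Any character in " \t" is replaced, one character at a time.
slug = replace("tab\tand space", " \t", '_', find, 1);
// slug should now be 'tab_and_space'

// 'with' is ignored here; the replacement is computed from the match in $0.
loud = replace('one fish two fish', 'fish', '', search, 4, >upper($0));
// loud should now be 'one FISH two FISH'
```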

rfind

Syntax

+offset = rfind('string', 'chars')

Description

As find(), but finds the last occurrence of any character of 'chars' instead of the first. -1 is returned if no character was found (unlike find() which returns the length of 'string').

Examples

rfind('abcd', 'd') == 3
rfind('abcdcba', 'dc') == 4
rfind('nomatch', 'xyz') == -1
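
The -1 result is convenient when the offset is immediately advanced by one, as in this sketch that strips a directory prefix from a hypothetical path:

```pikascript
path = '/usr/local/lib/readme.txt';     // hypothetical path
name = path{rfind(path, '/') + 1:};     // 'readme.txt'

// Without any '/', rfind() returns -1, so the offset becomes 0
// and the whole string is kept.
plain = 'readme.txt';
same = plain{rfind(plain, '/') + 1:};   // 'readme.txt'
```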

right

Syntax

'ending' = right('string', +count)

Description

Returns the last +count number of characters from 'string'. This function is equivalent to 'string'{length('string') - +count:}. If +count is greater than the length of 'string', the entire 'string' is returned. If +count is zero or negative, the empty string is returned. (There is no "left" function because you can easily use 'string'{:+count}.)

Examples

right('abcdefgh', 3) === 'fgh'
right('abcdefgh', 42) === 'abcdefgh'
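
For the same +count, chop() and right() split a string into two complementary pieces. A brief sketch with a made-up file name, including the zero-count edge cases described for both functions:

```pikascript
s = 'filename.txt';
base = chop(s, 4);                // 'filename'
ext = right(s, 4);                // '.txt'
whole = base # ext;               // 'filename.txt', the original string again

unchanged = chop(s, 0);           // 'filename.txt', nothing removed
nothing = right(s, 0);            // '', nothing kept
```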

rsearch

Syntax

+offset = rsearch('string', 'substring')

Description

As search(), but finds the last occurrence of 'substring' in 'string' instead of the first. A negative value is returned if 'substring' was not found (unlike search() which returns the length of 'string').

Examples

rsearch('abcdabcd', 'cd') == 6
rsearch('nomatch', 'xyz') == -3

rspan

Syntax

+offset = rspan('string', 'chars')

Description

As span(), but finds the last occurrence of a character not present in 'chars' instead of the first. -1 is returned if the entire 'string' consists of characters in 'chars' (unlike span() which returns the length of 'string').

Examples

rspan('abcd', 'abc') == 3
rspan('abcdcba', 'ab') == 4
rspan('george bush', 'he bugs gore') == -1

<search>

Syntax

+offset = search('string', 'substring')

Description

Finds the first occurrence of 'substring' in 'string' and returns the zero-based offset (i.e. 0 = first character). The search is case-sensitive. If 'substring' does not exist in 'string', the length of 'string' is returned. Use rsearch() to find the last occurrence instead of the first. Use find() to find the first occurrence of any character in a set of characters instead of a sub-string.

Examples

search('abcdabcd', 'cd') == 2
search('nomatch', 'x') == 7

<span>

Syntax

+offset = span('string', 'chars')

Description

Finds the first occurrence of a character in 'string' that is not present in 'chars' and returns the zero-based offset (i.e. 0 = first character). The search is case-sensitive. If the entire 'string' consists of characters in 'chars', the length of 'string' is returned. Use rspan() to find the last occurrence instead of the first. Use find() to find the first occurrence of any character in 'chars'.

Examples

span('abcd', 'abc') == 3
span('abcdcba', 'ab') == 2
span('george bush', 'he bugs gore') == 11
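
Together with rspan(), span() can locate where the "interesting" part of a string begins and ends. A minimal sketch that strips padding from one side at a time, using a made-up input:

```pikascript
s = '   padded   ';
head = s{span(s, ' '):};          // 'padded   ' (leading spaces skipped)
tail = s{:rspan(s, ' ') + 1};     // '   padded' (trailing spaces dropped)
```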

tokenize

Syntax

tokenize('source', >processor, ['delimiters' = "\n"])

Description

Divides the 'source' string into tokens separated by any character in 'delimiters' (linefeed by default). For every extracted token, >processor is called, passing the token as the single argument $0 (not including the delimiter). The final delimiter at the end of the string is optional. For example, tokenize() can be useful for reading individual lines from a text file, parsing tab or comma-separated data and splitting sentences into separate words.

Examples

tokenize("First line\nSecond line\nLast line\n", >append(@lines, $0))
tokenize('Eeny, meeny, miny, moe', >print(trim($0)), ',')
tokenize('Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.', >if ($0 !== '') append(@words, $0), " \t\r\n,.!?&\"/;:=-()[]{}")
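
The following sketch illustrates two points from the description: a custom delimiter set, and the optional final delimiter. The input strings are hypothetical, and print() is used only to make each call to >processor visible:

```pikascript
// Tab-separated fields; the processor receives each field in $0.
tokenize("alpha\tbeta\tgamma", >print('field: ' # $0), "\t");

// The trailing linefeed is optional, so both calls should invoke
// the processor exactly twice.
tokenize("first\nsecond\n", >print($0));
tokenize("first\nsecond", >print($0));
```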

trim

Syntax

'trimmed' = trim('string', ['leading' = " \t\r\n"], ['trailing' = " \t\r\n"])

Description

Removes leading and/or trailing characters from 'string'. By default, white space characters (space, tab, carriage return and linefeed) are trimmed from both ends. Pass void for 'leading' or 'trailing' to disable trimming at that end.

Examples

trim("  extractme\t") === 'extractme'
trim("\n    keep trailing spaces  \n", , void) === "keep trailing spaces  \n"
trim("--- keep me ---", '-', '-') === ' keep me '

unescape

Syntax

'raw' = unescape('escaped')

Description

Decodes a string that is enclosed in either single (') or double (") quotes. If single quotes (') are used, the text between the quotes is extracted "as is", except that pairs of apostrophes ('') represent single apostrophes. If the string is enclosed in double quotes ("), it can use a subset of the C-style escape sequences. The supported sequences are: \\ \" \' \a \b \f \n \r \t \v \xHH \uHHHH \<decimal>. If the string cannot be successfully converted, an exception will be thrown.

Inverse: escape('raw').

Examples

unescape("'trivial'") == 'trivial'
unescape('"it''s got an apostrophe"') == "it's got an apostrophe"
unescape(escape("first line\n\xe2\x00tail")) == "first line\n\xe2\x00tail"

<upper>

Syntax

'uppercase' = upper('string')

Description

Translates 'string' character by character to upper case. Note that the standard implementation only handles characters with ASCII values between 32 and 126 inclusive.

Examples

upper('aBcD') === 'ABCD'
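
A common use of upper() and lower() is case-insensitive comparison: normalize both operands to the same case before comparing. A small sketch with made-up file names:

```pikascript
a = 'README.TXT';
b = 'ReadMe.txt';
sameName = (lower(a) === lower(b));    // true
shout = upper('pika-script 1');        // 'PIKA-SCRIPT 1' (non-letters are left unchanged)
```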

wildfind

Syntax

+offset|void = wildfind('source', 'pattern', +from, +to, @captureQueue)

Description

This is a low-level subroutine used by wildmatch() to match the full or partial 'pattern' in 'source' between the offsets +from and +to (inclusively). The returned value is either the offset where the first match was found or void if no match was found. @captureQueue should be initialized with resetQueue() prior to calling this routine. "Captured ranges" will be pushed to this "queue" as pairs of offsets and lengths. Pop these with popFront().

See the documentation for wildmatch() for a description of the pattern syntax and more.

Examples

wildfind('abcdef', 'def', 0, 6, @c) == 3
wildfind('abcdef', '[def]', 0, 6, @c) == 5
wildfind('abcdef', '[def]*', 0, 6, @c) == 3
wildfind('abcdef', '[^def]', 4, 6, @c) == void

wildmatch

Syntax

?matched = wildmatch('source', 'pattern', [@captures, ...])

Description

Tries to match the 'source' string with 'pattern' (which may contain "wild card" patterns). true is returned if there is a match. You may also capture substrings from 'source' into the @captures variables. The pattern syntax is inspired by the "glob" standard (i.e. the syntax used for matching file names in most operating systems). However, a lot of additional features have been added, making the complexity of the syntax somewhere between glob and "regular expressions". It is easiest to describe with some examples:

*           any string (including the empty string)
?           a single arbitrary character
~           an optional arbitrary character
smurf       the string 'smurf' exactly (comparison is always case sensitive)
*smurf*     'smurf' anywhere in the source
????~~~~    between four and eight arbitrary characters
[a-zA-Z]    any single lower or upper case letter between 'a' and 'z'
[^a-zA-Z]   any single character that is not between 'a' and 'z' (case insensitive)
[*]         matches a single asterisk
[^]         a single ^ character only
[[]]        [ or ]
[]^]        ] or ^
[x-]        x or -
[0-9]*      a string consisting of zero or more digits
[0-9]????   exactly four digits
[0-9]?*     a string consisting of one or more digits
[0-9]??~~   between two and four digits
[0-9]?[]*   a single digit and then an arbitrary string
{*}smurf    captures everything before 'smurf' into the next @captures variable

Notice that the * and ~ quantifiers are always non-greedy (i.e. they match as little as they possibly can). This is a limitation of the current implementation; there are plans to let a double ** mark a greedy match instead. If you want to perform case-insensitive matching for the entire pattern, use lower() or upper() on the source string. There is also a low-level routine called wildfind() if you need greater control over the matching.

Examples

wildmatch('readme.txt', '*.txt')
wildmatch('myfile.with.extension', '{[^<>:"/\|?*]*}.{[^<>:"/\|?*.]*}', @filename, @extension) && filename === 'myfile.with' && extension === 'extension'
wildmatch(LF # "skip line\n\n\tmatch : me \nthis:is the rest" # LF, "*\n[ \t]*{[^ \t]?*}[ \t]*:*{[^ \t]?[]*}[ \t]*\n{*}", @key, @value, @theRest) && key === 'match' && value === 'me'
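
A simpler capture example may help before tackling the patterns above. The sketch below splits a hypothetical date string into three captures. It assumes that '-' outside brackets is matched literally and that the pattern must cover the whole source, as the 'smurf' versus '*smurf*' rows in the table suggest:

```pikascript
// '[0-9]????' is exactly four digits and '[0-9]??' exactly two.
ok = wildmatch('2024-05-17', '{[0-9]????}-{[0-9]??}-{[0-9]??}', @year, @month, @day);
// ok should be true, with year === '2024', month === '05' and day === '17'

// The source lacks the day part, so this match should fail.
bad = wildmatch('2024-05', '{[0-9]????}-{[0-9]??}-{[0-9]??}', @y, @m, @d);
// bad should be false

// Case-insensitive matching by normalizing the source first.
isText = wildmatch(lower('README.TXT'), '*.txt');
// isText should be true
```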
