3. Data types
In this chapter we will discuss all the different ways to store data in Pike in detail. We have seen examples of many of these, but we haven't really gone into how they work. In this chapter we will also see which operators and functions work with the different types.
Types in Pike are used in two different contexts: during compile-time and during run-time. Some types are only used during compile-time (void, mixed and all constructed types); all other types are also used during run-time. Also note the following functions and special forms:
- type typeof(mixed x)
- This special form returns the compile-time type of the expression x (which is not evaluated), i.e. the type that the compiler believes the expression will return if evaluated.
- type _typeof(mixed x)
- This function returns the run-time type of the value x (which is evaluated).
There are two categories of run-time data types in Pike: basic types, and pointer types. The difference is that basic types are copied when assigned to a variable. With pointer types, merely the pointer is copied, that way you get two variables pointing to the same thing.
3.1. Basic types
The basic types are int, float and string. For you who are accustomed to C or C++, it may seem odd that a string is a basic type as opposed to an array of char, but it is surprisingly easy to get used to.
3.1.1. int
Int is short for integer, or integer number. They are normally 32 bit integers, which means that they are in the range -2147483648 to 2147483647. (Note that on some machines an int might be larger than 32 bits.) If Pike is compiled with bignum support the 32 bit limitation does not apply and thus the integers can be of arbitrary size. Since they are integers, no decimals are allowed. An integer constant can be written in several ways:
Pattern | Example | Description |
-?[1-9][0-9]* | 78 | Decimal number |
-?0[0-9]* | 0116 | Octal number |
-?0[xX][0-9a-fA-F]+ | 0x4e | Hexadecimal number |
-?0[bB][01]+ | 0b1001110 | Binary number |
-?'\\?.' | 'N' | ASCII character |
All of the above represent the number 78. Octal notation means that each digit is worth 8 times as much as the one after it. Hexadecimal notation means that each digit is worth 16 times as much as the one after it, and it uses the letters a, b, c, d, e and f to represent the numbers 10, 11, 12, 13, 14 and 15. In binary notation every digit is worth twice the value of the succeeding digit, and only 1s and 0s are used. The ASCII notation gives the ASCII value of the character between the single quotes. In this case the character is N, which just happens to be 78 in ASCII. Some characters, such as newlines, cannot be placed within single quotes. The escape sequences for those characters, listed under strings, must be used instead. Specifically, this applies to the single quote character itself, which has to be written as '\''.
When Pike is compiled with bignum support, integers never overflow or underflow when they reach the system-defined maxint/minint. Instead they are silently converted into bignums. Integers are usually implemented as two's complement 32-bit integers, and are thus limited to the range -2147483648 to 2147483647. This may however vary between platforms, especially 64-bit platforms. FIXME: Conversion back to normal integer?
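The notations in the table above can be compared side by side. The following is a minimal sketch, assuming a Pike version in which write accepts sprintf-style format arguments:

```pike
int main()
{
  // Decimal, octal, hexadecimal, binary and character notation
  // for the same integer value.
  array(int) same = ({ 78, 0116, 0x4e, 0b1001110, 'N' });

  foreach (same, int n)
    write("%d\n", n);     // every line prints 78

  // The single quote itself must be written with an escape sequence.
  write("%d\n", '\'');    // prints the ASCII value of the single quote
  return 0;
}
```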
All the arithmetic, bitwise and comparison operators can be used on integers. Also note these functions:
- int intp(mixed x)
- This function returns 1 if x is an int, 0 otherwise.
- int random(int x)
- This function returns a random number greater than or equal to zero and smaller than x.
- int reverse(int x)
- This function reverses the order of the bits in x and returns the new number. It is not very useful.
- int sqrt(int x)
- This computes the square root of x. The value is always rounded down.
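As an illustration of the integer functions listed above, here is a short sketch. The die-rolling use of random is only an example and not taken from the text:

```pike
int main()
{
  write("%d\n", intp(78));      // 1: 78 is an int
  write("%d\n", intp("78"));    // 0: a string is not an int
  write("%d\n", sqrt(17));      // 4: the square root is rounded down

  // random(6) yields a value from 0 to 5, so adding 1 simulates a die.
  int die = random(6) + 1;
  write("%d\n", die >= 1 && die <= 6);   // 1
  return 0;
}
```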
3.1.2. float
Although most programs only use integers, they are impractical when doing trigonometric calculations, transformations or anything else where you need decimals. For this purpose you use float. Floats are normally 32 bit floating point numbers, which means that they can represent very large and very small numbers, but only with 9 accurate digits. To write a floating point constant, you just put in the decimals or write it in the exponential form:
Pattern | Example | Equals |
-?[0-9]*\.[0-9]+ | 3.1415926 | 3.1415926 |
-?[0-9]+e-?[0-9]+ | -5e3 | -5000.0 |
-?[0-9]*\.[0-9]+e-?[0-9]+ | .22e-2 | 0.0022 |
Of course you can have any number of decimals to increase the accuracy. Usually digits after the ninth digit are ignored, but on some architectures float might have higher accuracy than that. In the exponential form, e means "times 10 to the power of", so 1.0e9 is equal to "1.0 times 10 to the power of 9". FIXME: float and int are not compatible and there is no implicit cast like in C++.
All the arithmetic and comparison operators can be used on floats. Also, these functions operate on floats:
- trigonometric functions
- The trigonometric functions are: sin, asin, cos, acos, tan and atan. If you do not know what these functions do you probably don't need them. Asin, acos and atan are of course short for arc sine, arc cosine and arc tangent. On a calculator they are often known as inverse sine, inverse cosine and inverse tangent.
- float log(float x)
- This function computes the natural logarithm of x.
- float exp(float x)
- This function computes e raised to the power of x.
- float pow(float|int x, float|int y)
- This function computes x raised to the power of y.
- float sqrt(float x)
- This computes the square root of x.
- float floor(float x)
- This function computes the largest integer value less than or equal to x. Note that the value is returned as a float, not an int.
- float ceil(float x)
- This function computes the smallest integer value greater than or equal to x and returns it as a float.
- float round(float x)
- This function computes the closest integer value to x and returns it as a float.
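The rounding functions return floats rather than ints, which is easy to overlook. A small sketch of how they behave, with an explicit cast where an int is wanted (the tolerance 1e-5 is an arbitrary choice for this example):

```pike
int main()
{
  float x = 2.7;

  // floor, ceil and round return floats.
  write("%d\n", floor(x) == 2.0);   // 1
  write("%d\n", ceil(x) == 3.0);    // 1
  write("%d\n", round(x) == 3.0);   // 1

  // Cast explicitly when an int is needed.
  write("%d\n", (int) floor(x));    // 2

  // Exponential notation: -5e3 is the float -5000.0.
  write("%d\n", -5e3 == -5000.0);   // 1

  // exp and log undo each other, up to rounding error.
  write("%d\n", abs(log(exp(1.0)) - 1.0) < 1e-5);   // 1
  return 0;
}
```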
3.1.3. string
A string can be seen as an array of values from 0 to 2³²-1. Usually a string contains text such as a word, a sentence, a page or even a whole book. But it can also contain parts of a binary file, compressed data or other binary data. Strings in Pike are shared, which means that identical strings share the same memory space. This reduces memory usage very much for most applications and also speeds up string comparisons. We have already seen how to write a constant string:
  "hello world"   // hello world
  "he" "llo"      // hello
  "\116"          // N (116 is the octal ASCII value for N)
  "\t"            // A tab character
  "\n"            // A newline character
  "\r"            // A carriage return character
  "\b"            // A backspace character
  "\0"            // A null character
  "\""            // A double quote character
  "\\"            // A single backslash
  "\x4e"          // N (4e is the hexadecimal ASCII value for N)
  "\d78"          // N (78 is the decimal ASCII value for N)
  "hello world\116\t\n\r\b\0\"\\"   // All of the above
  "\xff"          // the character 255
  "\xffff"        // the character 65535
  "\xffffff"      // the character 16777215
  "\116""3"       // 'N' followed by a '3'
Pattern | Example |
. | N |
\\[0-7]+ | \116 |
\\x[0-9a-fA-F]+ | \x4e |
\\d[0-9]+ | \d78 |
\\u[0-9a-fA-F]+ (4) | \u004E |
\\U[0-9a-fA-F]+ (8) | \U0000004e |
Sequence | ASCII code | Character |
\a | 7 | An alert (bell) character |
\b | 8 | A backspace character |
\t | 9 | A tab character |
\n | 10 | A newline character |
\v | 11 | A vertical tab character |
\f | 12 | A form feed character |
\r | 13 | A carriage return character |
\" | 34 | A double quote character |
\\ | 92 | A backslash character |
As you can see, any sequence of characters within double quotes is a string. The backslash character is used to escape characters that are not allowed or impossible to type. For example, \t is the sequence that produces a tab character, \\ is used when you want one backslash and \" is used when you want a double quote (") to be a part of the string instead of ending it. Also, \XXX where XXX is an octal number from 0 to 37777777777 or \xXX where XX is 0 to ffffffff lets you write any character you want in the string, even null characters. From version 0.6.105, you may also use \dXXX where XXX is 0 to 2³²-1. If you write two constant strings after each other, they will be concatenated into one string.
You might be surprised to see that individual characters can have values up to 2³²-1 and wonder how much memory that uses. Do not worry: Pike automatically decides the proper amount of memory for a string, so all strings with character values in the range 0-255 will be stored with one byte per character. You should also be aware that not all functions can handle strings which are not stored as one byte per character, so there are some limits to when this feature can be used.
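The escape sequences and the concatenation of adjacent constants can be checked with a few comparisons. This is only a sketch; each line prints 1 when the comparison holds:

```pike
int main()
{
  write("%d\n", "he" "llo" == "hello");   // 1: adjacent constants are concatenated
  write("%d\n", "\116" == "N");           // 1: octal escape
  write("%d\n", "\x4e" == "N");           // 1: hexadecimal escape
  write("%d\n", "\d78" == "N");           // 1: decimal escape
  write("%d\n", sizeof("a\0b"));          // 3: a null character is an ordinary character
  write("%d\n", sizeof("\116""3"));       // 2: 'N' followed by '3'
  return 0;
}
```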
Although strings are a form of arrays, they are immutable. This means that there is no way to change an individual character within a string without creating a new string. This may seem strange, but keep in mind that strings are shared, so if you would change a character in the string "foo", you would change *all* "foo" everywhere in the program.
However, the Pike compiler will allow you to write code as if you could change characters within strings. The following code is valid and works:
  string s="hello torld";
  s[6]='w';
However, you should be aware that this does in fact create a new string and
it may need to copy the string s to do so. This means that the above
operation can be quite slow for large strings. You have been warned.
Most of the time, you can use replace, sscanf, `/ or some other high-level string operation to avoid having to use the above construction too much.
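To make the point about immutability concrete, the sketch below keeps a second variable referring to the original string and shows that index assignment leaves it untouched. The variable name t is introduced only for this example:

```pike
int main()
{
  string s = "hello torld";
  string t = s;       // t refers to the same shared string as s

  s[6] = 'w';         // builds a new string and stores it in s
  write(s + "\n");    // hello world
  write(t + "\n");    // hello torld: the original string is unchanged

  // replace reaches the same result without index assignment.
  write(replace(t, "torld", "world") + "\n");   // hello world
  return 0;
}
```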
All the comparison operators plus the operators listed here can be used on strings:
- Summation
- Adding strings together will simply concatenate them. "foo"+"bar" becomes "foobar".
- Subtraction
- Subtracting one string from another will remove all occurrences of the second string from the first one. So "foobarfoogazonk" - "foo" results in "bargazonk".
- Indexing
- Indexing will let you get the ASCII value of any character in a string. The first index is zero.
- Range
- The range operator will let you copy any part of the string into a new string. Example: "foobar"[2..4] will return "oba".
- Division
- Division will let you divide a string at every occurrence of a word or character. For instance if you do "foobargazonk" / "o" the result would be ({"f","","bargaz","nk"}). It is also possible to divide the string into strings of length N by dividing the string by N. If N is converted to a float before dividing, the remainder of the division will be included in the result.
- Multiplication
- The inverse of the division operator can be accomplished by multiplying an array with a string. So if you evaluate ({"f","","bargaz","nk"}) * "o" the result would be "foobargazonk".
- Modulo
- To complement the division operator, you can do string % int. This operator will simply return the part of the string that was not included in the array returned by string / int.
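The string operators above can be tried out as follows. This is a sketch that reuses the examples from the list; array results are compared with equal so that the output does not depend on how arrays are printed:

```pike
int main()
{
  write("%s\n", "foo" + "bar");               // foobar
  write("%s\n", "foobarfoogazonk" - "foo");   // bargazonk
  write("%s\n", "foobar"[2..4]);              // oba
  write("%d\n", "foobar"[0] == 'f');          // 1: indexing gives a character value

  array(string) parts = "foobargazonk" / "o";
  write("%d\n", equal(parts, ({ "f", "", "bargaz", "nk" })));   // 1
  write("%s\n", parts * "o");                 // foobargazonk

  // Dividing by an int drops the remainder; % returns it.
  write("%d\n", equal("foobargazonk" / 5, ({ "fooba", "rgazo" })));   // 1
  write("%s\n", "foobargazonk" % 5);          // nk

  // Dividing by a float keeps the remainder as a last, shorter piece.
  write("%d\n", sizeof("foobargazonk" / 5.0));   // 3
  return 0;
}
```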
Also, these functions operate on strings:
- string String.capitalize(string s)
- Returns s with the first character converted to upper case.
- int String.count(string haystack, string needle)
- Returns the number of occurrences of needle in haystack. Equivalent to sizeof(haystack/needle)-1.
- int String.width(string s)
- Returns the width of s in bits (8, 16 or 32).
- string lower_case(string s)
- Returns s with all the upper case characters converted to lower case.
- string replace(string s, string from, string to)
- This function replaces all occurrences of the string from in s with to and returns the new string.
- string reverse(string s)
- This function returns a copy of s with the last character from s first, the second last in second place and so on.
- int search(string haystack, string needle)
- This function finds the first occurrence of needle in haystack and returns where it found it.
- int sizeof(string s)
- Same as strlen(s), returns the length of the string.
- int stringp(mixed s)
- This function returns 1 if s is a string, 0 otherwise.
- int strlen(string s)
- Returns the length of the string s.
- string upper_case(string s)
- This function returns s with all lower case characters converted to upper case.
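A brief sketch of the string functions listed above. The sample strings are arbitrary and chosen only for illustration:

```pike
int main()
{
  string s = "Hello World";

  write("%s\n", lower_case(s));                      // hello world
  write("%s\n", upper_case(s));                      // HELLO WORLD
  write("%s\n", String.capitalize("pike"));          // Pike
  write("%d\n", String.count("foobarfoo", "foo"));   // 2
  write("%d\n", search("foobar", "bar"));            // 3
  write("%s\n", reverse("abc"));                     // cba
  write("%d\n", sizeof(s) == strlen(s));             // 1
  write("%d\n", stringp(s));                         // 1

  // Strings are stored as narrowly as their characters allow.
  write("%d\n", String.width(s));                    // 8
  write("%d\n", String.width("\x1234"));             // 16
  return 0;
}
```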
3.2. Pointer types
The basic types are, as the name implies, very basic. They are the foundation; most of the pointer types are merely interesting ways to store the basic types. The pointer types are array, mapping, multiset, program, object and function. They are all pointers, which means that they point to something in memory. This "something" is freed when there are no more pointers to it. Assigning a value of a pointer type to a variable will not copy this "something"; instead it will only generate a new reference to it. Special care sometimes has to be taken when giving one of these types as arguments to a function; the function can in fact modify the "something". If this effect is not wanted you have to explicitly copy the value. More about this will be explained later in this chapter.
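The difference between sharing and copying can be sketched with an array. The helper clear_first is hypothetical and exists only for this example; copy_value is used here as one way to make an explicit copy:

```pike
// Hypothetical helper: modifies the array it is given.
void clear_first(array(int) arr)
{
  arr[0] = 0;
}

int main()
{
  array(int) a = ({ 1, 2, 3 });
  array(int) b = a;               // b points to the same array as a
  array(int) c = copy_value(a);   // c is an independent copy

  b[0] = 99;
  write("%d\n", a[0]);   // 99: the change is visible through a
  write("%d\n", c[0]);   // 1: the copy is unaffected

  clear_first(a);
  write("%d\n", b[0]);   // 0: the function modified the shared array
  return 0;
}
```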
3.2.1. array
Arrays are the simplest of the pointer types. An array is merely a block of memory with a fixed size containing a number of slots which can hold any type of value. These slots are called elements and are accessible through the index operator. To write a constant array you enclose the values you want in the array with ({ }) like this:
  ({ })              // Empty array
  ({ 1 })            // Array containing one element of type int
  ({ "" })           // Array containing a string
  ({ "", 1, 3.0 })   // Array of three elements, each of different type
As you can see, each element in the array can contain any type of value. Indexing and ranges on arrays work just like on strings, except that with arrays you can change values inside the array with the index operator. However, there is no way to change the size of the array, so if you want to append values to the end you still have to add it to another array, which creates a new array. Figure 4.1 shows the schematics of an array. As you can see, it is a very simple memory structure.
Operators and functions usable with arrays:
- indexing ( arr [ c ] )
- Indexing an array retrieves or sets a given element in the array. The index c has to be an integer. To set an index, simply put the whole thing on the left side of an assignment, like this: arr [ c ] = new_value
- range ( arr [ from .. to ] )
- The range copies the elements from, from+1, from+2 ... to into a new array. The new array will have the size to-from+1.
- comparing (a == b and a != b)
- The equal operator returns 1 if a and b are the same arrays. It is not enough that they have the same size and same data. They must be the same array. For example: ({1}) == ({1}) would return 0, while array(int) a=({1}); return a==a; would return 1. Note that you cannot use the operators >, >=, < or <= on arrays.
- Summation (a + b)
- As with strings, summation concatenates arrays. ({1})+({2}) returns ({1,2}).
- Subtractions (a - b)
- Subtracting one array from another returns a copy of a with all the elements that are also present in b removed. So ({1,3,8,3,2}) - ({3,1}) returns ({8,2}).
- Intersection (a & b)
- Intersection returns an array with all values that are present in both a and b. The order of the elements will be the same as the order of the elements in a. Example: ({1,3,7,9,11,12}) & ({4,11,8,9,1}) will return: ({1,9,11}).
- Union (a | b)
- Union works almost as summation, but it only adds elements not already present in a. So, ({1,2,3}) | ({1,3,5}) will return ({1,2,3,5}). Note: the order of the elements in a can be changed!
- Xor (a ^ b)
- This is also called symmetric difference. It returns an array with all elements present in a or b but the element must NOT be present in both. Example: ({1,3,5,6}) ^ ({4,5,6,7}) will return ({1,3,4,7}).
- Division (a / b)
- This will split the array a into an array of arrays. If b is another array, a will be split at each occurrence of that array. If b is an integer or float, a will be split between every bth element. Examples: ({1,2,3,4,5})/({2,3}) will return ({ ({1}), ({4,5}) }) and ({1,2,3,4})/2 will return ({ ({1,2}), ({3,4}) }).
- Modulo (a % b)
- This operation is valid only if b is an integer. It will return the part of the array that was not included by dividing a by b.
- array aggregate(mixed ... elems)
- This function does the same as the ({ }) operator; it creates an array from all arguments given to it. In fact, writing ({1,2,3}) is the same as writing aggregate(1,2,3).
- array allocate(int size)
- This function allocates a new array of size size. All the elements in the new array will be zeroes.
- int arrayp(mixed a)
- This function returns 1 if a is an array, 0 otherwise.
- array column(array(mixed) a, mixed ind)
- This function goes through the array a, indexes every element in it on ind and builds an array of the results. So if you have an array a in which each element is also an array, this function will take a cross section by picking out element ind from each of the arrays in a. Example: column( ({ ({1,2,3}), ({4,5,6}), ({7,8,9}) }), 2) will return ({3,6,9}).
- int equal(mixed a, mixed b)
- This function returns 1 if a and b look the same. They do not have to be pointers to the same array, as long as they are the same size and contain equal data.
- array filter(array a, mixed func, mixed ... args)
- filter returns every element in a for which func returns true when called with that element as first argument, and args for the second, third, etc. arguments. (Both a and func can be other things; see the reference for filter for details about that.)
- array map(array a, mixed func, mixed ... args)
- This function works similar to filter but returns the results of the function func instead of returning the elements from a for which func returns true. (Like filter, this function accepts other things for a and func; see the reference for map.)
- array replace(array a, mixed from, mixed to)
- This function will create a copy of a with all elements equal to from replaced by to.
- array reverse(array a)
- Reverse will create a copy of a with the last element first, the last but one second, and so on.
- array rows(array a, array indexes)
- This function is similar to column. It indexes a with each element from indexes and returns the results in an array. For example: rows( ({"a","b","c"}), ({ 2,1,2,0}) ) will return ({"c","b","c","a"}).
- int search(array haystack, mixed needle)
- This function returns the index of the first occurrence of an element equal (tested with ==) to needle in the array haystack.
- int sizeof(mixed arr)
- This function returns the number of elements in the array arr.
- array sort(array arr, array ... rest)
- This function sorts arr in smaller-to-larger order. Numbers, floats and strings can be sorted. If there are any additional arguments, they will be permutated in the same manner as arr. See functions for more details.
- array Array.uniq(array a)
- This function returns a copy of the array a with all duplicate elements removed. Note that this function can return the elements in any order.
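The array operators and functions described above can be exercised with a small self-checking sketch. The helper check is hypothetical and only prints whether each comparison held; set-like results are sorted before comparing so that the checks do not depend on element order:

```pike
// Hypothetical helper: reports whether one check held.
void check(string what, mixed ok)
{
  write("%s: %s\n", what, ok ? "ok" : "FAILED");
}

int main()
{
  array(int) a = ({ 1, 3, 8, 3, 2 });

  // == tests identity, equal tests contents.
  check("identity", a == a && !(({ 1 }) == ({ 1 })));
  check("equal", equal(({ 1 }), ({ 1 })));

  check("summation", equal(({ 1 }) + ({ 2 }), ({ 1, 2 })));
  check("subtraction", equal(a - ({ 3, 1 }), ({ 8, 2 })));
  check("intersection",
        equal(sort(({ 1, 3, 7, 9, 11, 12 }) & ({ 4, 11, 8, 9, 1 })),
              ({ 1, 9, 11 })));
  check("union",
        equal(sort(({ 1, 2, 3 }) | ({ 1, 3, 5 })), ({ 1, 2, 3, 5 })));
  check("xor",
        equal(sort(({ 1, 3, 5, 6 }) ^ ({ 4, 5, 6, 7 })), ({ 1, 3, 4, 7 })));
  check("division",
        equal(({ 1, 2, 3, 4 }) / 2, ({ ({ 1, 2 }), ({ 3, 4 }) })));
  check("modulo", equal(({ 1, 2, 3, 4, 5 }) % 2, ({ 5 })));

  check("column",
        equal(column(({ ({ 1, 2, 3 }), ({ 4, 5, 6 }) }), 2), ({ 3, 6 })));
  check("rows",
        equal(rows(({ "a", "b", "c" }), ({ 2, 1, 2, 0 })),
              ({ "c", "b", "c", "a" })));
  check("filter",
        equal(filter(({ 1, 2, 3, 4 }), lambda(int x) { return x > 2; }),
              ({ 3, 4 })));
  check("map",
        equal(map(({ 1, 2, 3 }), lambda(int x) { return x * 2; }),
              ({ 2, 4, 6 })));
  check("search", search(({ "a", "b", "c" }), "b") == 1);
  check("allocate", equal(allocate(3), ({ 0, 0, 0 })));
  return 0;
}
```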
3.2.2. mapping
Mappings are really just more generic arrays. However, they are slower and use more memory than arrays, so they cannot replace arrays completely. What makes mappings special is that they can be indexed on other things than integers. We can imagine that a mapping looks like this:
Each index-value pair is floating around freely inside the mapping. There is exactly one value for each index. We also have a (magical) lookup function. This lookup function can find any index in the mapping very quickly. Now, if the mapping is called m and we index it like this: m [ i ] the lookup function will quickly find the index i in the mapping and return the corresponding value. If the index is not found, zero is returned instead. If we on the other hand assign an index in the mapping the value will instead be overwritten with the new value. If the index is not found when assigning, a new index-value pair will be added to the mapping. Writing a constant mapping is easy:
  ([ ])                       // Empty mapping
  ([ 1:2 ])                   // Mapping with one index-value pair, the 1 is the index
  ([ "one":1, "two":2 ])      // Mapping which maps words to numbers
  ([ 1:({2.0}), "":([]), ])   // Mapping with lots of different types
As with arrays, mappings can contain any type. The main difference is that the index can be any type too. Also note that the index-value pairs in a mapping are not stored in a specific order. You cannot refer to the fourteenth index-value pair, since there is no way of telling which one is the fourteenth. Because of this, you cannot use the range operator on mappings.
The following operators and functions are important:
- indexing ( m [ ind ] )
- As discussed above, indexing is used to retrieve, store and add values to the mapping.
- addition, subtraction, union, intersection and xor
- All these operators work exactly as on arrays, with the difference that they operate on the indices. In those cases when the value can come from either mapping, it will be taken from the right side of the operator. This makes it easier to add new values to a mapping with +=. Some examples:
([1:3, 3:1]) + ([2:5, 3:7]) returns ([1:3, 2:5, 3:7 ])
([1:3, 3:1]) - ([2:5, 3:7]) returns ([1:3])
([1:3, 3:1]) | ([2:5, 3:7]) returns ([1:3, 2:5, 3:7 ])
([1:3, 3:1]) & ([2:5, 3:7]) returns ([3:7])
([1:3, 3:1]) ^ ([2:5, 3:7]) returns ([1:3, 2:5])
- same ( a == b )
- Returns 1 if a is the same mapping as b, 0 otherwise.
- not same ( a != b )
- Returns 0 if a is the same mapping as b, 1 otherwise.
- array indices(mapping m)
- Indices returns an array containing all the indices in the mapping m.
- mixed m_delete(mapping m, mixed ind)
- This function removes the index-value pair with the index ind from the mapping m. It will return the value that was removed.
- int mappingp(mixed m)
- This function returns 1 if m is a mapping, 0 otherwise.
- mapping mkmapping(array ind, array val)
- This function constructs a mapping from the two arrays ind and val. Element 0 in ind and element 0 in val become one index-value pair. Element 1 in ind and element 1 in val become another index-value pair, and so on.
- mapping replace(mapping m, mixed from, mixed to)
- This function creates a copy of the mapping m with all values equal to from replaced by to.
- mixed search(mapping m, mixed val)
- This function returns the index of the 'first' index-value pair which has the value val.
- int sizeof(mapping m)
- Sizeof returns how many index-value pairs there are in the mapping.
- array values(mapping m)
- This function does the same as indices, but returns an array with all the values instead. If indices and values are called on the same mapping after each other, without any other mapping operations in between, the returned arrays will be in the same order. They can in turn be used as arguments to mkmapping to rebuild the mapping m again.
- int zero_type(mixed t)
- When indexing a mapping and the index is not found, zero is returned. However, problems can arise if you have also stored zeroes in the mapping. This function allows you to see the difference between the two cases. If zero_type(m [ ind ]) returns 1, it means that the value was not present in the mapping. If the value was present in the mapping, zero_type will return something else than 1.
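The mapping operations above, including the zero_type distinction between a missing index and a stored zero, can be sketched as follows. The index names are arbitrary examples:

```pike
int main()
{
  mapping(string:int) m = ([ "one": 1, "two": 2 ]);

  m["three"] = 3;    // new index: a pair is added
  m["one"] = 100;    // existing index: the value is overwritten
  m["zero"] = 0;     // a stored zero

  write("%d\n", m["two"]);                   // 2
  write("%d\n", m["missing"]);               // 0: index not found
  write("%d\n", zero_type(m["missing"]));    // 1: not present in the mapping
  write("%d\n", zero_type(m["zero"]) != 1);  // 1: present, although the value is zero

  write("%d\n", m_delete(m, "one"));         // 100: the removed value
  write("%d\n", sizeof(m));                  // 3

  // indices and values come back in matching order, so
  // mkmapping can rebuild an equal mapping.
  mapping copy = mkmapping(indices(m), values(m));
  write("%d\n", equal(copy, m));             // 1
  write("%d\n", copy == m);                  // 0: not the same mapping

  // With +, values for shared indices come from the right side.
  write("%d\n", equal(([1:3, 3:1]) + ([2:5, 3:7]), ([1:3, 2:5, 3:7])));   // 1
  return 0;
}
```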
3.2.3. multiset
A multiset is almost the same thing as a mapping. The difference is that there are no values:
Instead, the index operator will return 1 if the value was found in the multiset and 0 if it was not. When assigning an index to a multiset like this: mset[ ind ] = val the index ind will be added to the multiset mset if val is true. Otherwise ind will be removed from the multiset instead.
Writing a constant multiset is similar to writing an array:
  (< >)                 // Empty multiset
  (< 17 >)              // Multiset with one index: 17
  (< "", 1, 3.0, 1 >)   // Multiset with four indices
Note that you can actually have more than one of the same index in a multiset. This is normally not used, but can be practical at times.
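A minimal sketch of adding, testing and removing indices in a multiset, using made-up index names:

```pike
int main()
{
  multiset(string) seen = (< "apple" >);

  seen["pear"] = 1;                // a true value adds the index
  write("%d\n", seen["pear"]);     // 1
  write("%d\n", seen["plum"]);     // 0: not in the multiset

  seen["apple"] = 0;               // a false value removes the index
  write("%d\n", seen["apple"]);    // 0
  write("%d\n", sizeof(seen));     // 1
  return 0;
}
```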
3.2.4. program
Normally, when we say program we mean something we can execute from a shell prompt. However, Pike has another meaning for the same word. In Pike a program is the same as a class in C++. A program holds a table of what functions and variables are defined in that program. It also holds the code itself, debug information and references to other programs in the form of inherits. A program does not hold space to store any data however. All the information in a program is gathered when a file or string is run through the Pike compiler. The variable space needed to execute the code in the program is stored in an object which is the next data type we will discuss.
Writing a program is easy, in fact, every example we have tried so far has been a program. To load such a program into memory, we can use compile_file which takes a file name, compiles the file and returns the compiled program. It could look something like this:
program p = compile_file("hello_world.pike");
You can also use the cast operator like this:
program p = (program) "hello_world";
This will also load the program hello_world.pike, the only difference is that it will cache the result so that next time you do (program)"hello_world" you will receive the _same_ program. If you call compile_file("hello_world.pike") repeatedly you will get a new program each time.
There is also a way to write programs inside programs with the help of the class keyword:
class class_name { inherits, variables and functions }
The class keyword can be written as a separate entity outside of all functions, but it is also an expression which returns the program written between the brackets. The class_name is optional. If used you can later refer to that program by the name class_name. This is very similar to how classes are written in C++ and can be used in much the same way. It can also be used to create structs (or records if you program Pascal). Let's look at an example:
  class record {
    string title;
    string artist;
    array(string) songs;
  }

  array(record) records = ({});

  void add_empty_record()
  {
    records+=({ record() });
  }

  void show_record(record rec)
  {
    write("Record name: "+rec->title+"\n");
    write("Artist: "+rec->artist+"\n");
    write("Songs:\n");
    foreach(rec->songs, string song)
      write(" "+song+"\n");
  }
This could be a small part of a better record register program. It is not a complete executable program in itself. In this example we create a program called record which has three identifiers. In add_empty_record a new object is created by calling record. This is called cloning and it allocates space to store the variables defined in the class record. Show_record takes one of the records created in add_empty_record and shows the contents of it. As you can see, the arrow operator is used to access the data allocated in add_empty_record. If you do not understand this section I suggest you go on and read the next section about objects and then come back and read this section again.
- cloning
- To create a data area for a program you need to instantiate or clone the program. This is accomplished by using a pointer to the program as if it was a function and calling it. That creates a new object and calls the function create in the new object with the arguments.
- compiling
- All programs are generated by compiling a string. The string may of course be read from a file. For this purpose there are three functions:
program compile(string p);
program compile_file(string filename);
program compile_string(string p, string filename);
compile_file simply reads the file given as argument, compiles it and returns the resulting program. compile_string instead compiles whatever is in the string p. The second argument, filename, is only used in debug printouts when an error occurs in the newly made program. Both compile_file and compile_string call compile to actually compile the string after having called cpp on it.
- casting
- Another way of compiling files to program is to use the cast operator. Casting a string to the type program calls a function in the master object which will compile the program in question for you. The master also keeps the program in a cache, so if you later need the same program again it will not be re-compiled.
- int programp(mixed p)
- This function returns 1 if p is a program, 0 otherwise.
- comparisons
- As with all data types == and != can be used to see if two programs are the same or not.
The following operators and functions are important:
- cloning ( p ( args ) )
- Creates an object from a program. Discussed in the next section.
- indexing ( p [ string ], or p -> identifier )
- Retrieves the value of the named constant from a program.
- array(string) indices(program p)
- Returns an array with the names of all non-protected constants in the program.
- array(mixed) values(program p)
- Returns an array with the values of all non-protected constants in the program.
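Compiling, cloning and indexing a program can be sketched without any files on disk by using compile_string. The class Point, its constant dimensions and the compiled function answer are hypothetical names made up for this example:

```pike
// Hypothetical class used to show indexing of a program.
class Point
{
  constant dimensions = 2;
  int x, y;
}

int main()
{
  // compile_string turns source text into a program.
  string source = "int answer() { return 42; }";
  program p = compile_string(source);
  write("%d\n", programp(p));          // 1

  // Cloning: call the program as if it was a function.
  object o = p();
  write("%d\n", o->answer());          // 42

  // Compiling the same source again gives a new, different program.
  program q = compile_string(source);
  write("%d\n", p == q);               // 0

  // Indexing a program retrieves a named constant.
  write("%d\n", Point->dimensions);    // 2
  write("%d\n", Point["dimensions"]);  // 2
  return 0;
}
```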
3.2.5. object
Although programs are absolutely necessary for any application you might want to write, they are not enough. A program doesn't have anywhere to store data; it merely outlines how to store data. To actually store the data you need an object. An object is basically a chunk of memory with a reference to the program from which it was cloned. Many objects can be made from one program. The program outlines where in the object different variables are stored.
Each object has its own set of variables, and when calling a function in that object, that function will operate on those variables. If we take a look at the short example in the section about programs, we see that it would be better to write it like this:
  class record {
    string title;
    string artist;
    array(string) songs;

    void show()
    {
      write("Record name: "+title+"\n");
      write("Artist: "+artist+"\n");
      write("Songs:\n");
      foreach(songs, string song)
        write(" "+song+"\n");
    }
  }

  array(record) records = ({});

  void add_empty_record()
  {
    records+=({ record() });
  }

  void show_record(object rec)
  {
    rec->show();
  }
Here we can clearly see how the function show prints the contents of the variables in that object. In essence, instead of accessing the data in the object with the -> operator, we call a function in the object and have it write the information itself. This type of programming is very flexible, since we can later change how record stores its data, but we do not have to change anything outside of the record program.
Functions and operators relevant to objects:
- indexing
- Objects can be indexed on strings to access identifiers. If the identifier is a variable, the value can also be set using indexing. If the identifier is a function, a pointer to that function will be returned. If the identifier is a constant, the value of that constant will be returned. Note that the -> operator is actually the same as indexing. This means that o->foo is the same as o["foo"]
- cloning
- As discussed in the section about programs, cloning a program is done by using a pointer to the program as a function and calling it. Whenever you clone an object, all the global variables will be initialized. After that the function create will be called with any arguments you call the program with.
- void
destruct
(object o) - This function invalidates all references to the object o and frees all variables in that object. This function is also called when o runs out of references. If there is a function named destroy in the object, it will be called before the actual destruction of the object.
- array(string)
indices
(object o) - This function returns a list of all identifiers in the object o.
- program
object_program
(object o) - This function returns the program from which o was cloned.
- int
objectp
(mixed o) - This function returns 1 if o is an object, 0 otherwise. Note that if o has been destructed, this function will return 0.
- object
this_object
() - This function returns the object in which the interpreter is currently executing.
- array
values
(object o) - This function returns the same as rows(o,indices(o)). That means it returns all the values of the identifiers in the object o.
- comparing
- As with all data types == and != can be used to check if two objects are the same or not.
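The operations listed above can be sketched in a short program. This is only an illustration: the class Record and its fields are hypothetical names chosen for the example, not part of any Pike module.

```pike
// Hypothetical class used only for this illustration.
class Record {
  string title;
  string artist;
}

int main()
{
  Record r = Record();

  // r->title and r["title"] access the same variable.
  r->title = "Blue Train";
  r["artist"] = "John Coltrane";
  write("%s by %s\n", r["title"], r->artist);

  // indices() lists the identifiers in the object.
  write("%O\n", sort(indices(r)));

  // object_program() gives back the program the object was cloned from.
  write("%d\n", object_program(r) == Record);  // 1
  write("%d\n", objectp(r));                   // 1

  // After destruct() the reference is no longer an object.
  destruct(r);
  write("%d\n", objectp(r));                   // 0
  return 0;
}
```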
3.2.6. function
When indexing an object on a string, and that string is the name of a function in the object, a function is returned. Despite its name, a function is really a function pointer.
When the function pointer is called, the interpreter sets this_object() to the object in which the function is located and proceeds to execute the function it points to. Also note that function pointers can be passed around just like any other data type:
int foo() { return 1; }
function bar() { return foo; }
int gazonk() { return foo(); }
int teleledningsanka() { return bar()(); }
In this example, the function bar returns a pointer to the function foo. No indexing is necessary since the function foo is located in the same object. The function gazonk simply calls foo. However, note that the word foo in that function is an expression returning a function pointer that is then called. To further illustrate this, foo has been replaced by bar() in the function teleledningsanka.
For convenience, there is also a simple way to write a function inside another function. To do this you use the lambda keyword. The syntax is the same as for a normal function, except you write lambda instead of the function name:
lambda ( types ) { statements }
The major difference is that this is an expression that can be used inside another function. Example:
function bar() { return lambda() { return 1; }; }
This is the same as the first two lines in the previous example, the keyword lambda allows you to write the function inside bar.
Note that unlike C++ and Java you can not use function overloading in Pike. This means that you cannot have one function called 'foo' which takes an integer argument and another function 'foo' which takes a float argument.
This is what you can do with a function pointer.
- calling ( f ( mixed ... args ) )
- As mentioned earlier, all function pointers can be called. In this example the function f is called with the arguments args.
- string
function_name
(function f) - This function returns the name of the function f is pointing at.
- object
function_object
(function f) - This function returns the object the function f is located in.
- int
functionp
(mixed f) - This function returns 1 if f is a function, 0 otherwise. If f is located in a destructed object, 0 is returned.
- function
this_function
() - This function returns a pointer to the function it is called from. This is normally only used with lambda functions because they do not have a name.
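As a minimal sketch of the functions above, the following program inspects and passes around a function pointer. The helper call_it is a hypothetical name introduced for the example.

```pike
int foo() { return 1; }

// Hypothetical helper: calls whatever function pointer it is given.
int call_it(function(:int) f) { return f(); }

int main()
{
  function f = foo;

  write("%s\n", function_name(f));                     // foo
  write("%d\n", functionp(f));                         // 1
  // foo lives in the same object that main is running in.
  write("%d\n", function_object(f) == this_object());  // 1

  // Function pointers and lambdas are passed like any other value.
  write("%d\n", call_it(f));                           // 1
  write("%d\n", call_it(lambda() { return 2; }));      // 2
  return 0;
}
```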
3.3. Compile-time types
The following types are pure compile-time types:
3.3.1. void
The type void is used to indicate the absence or optionality of a value. There are two typical use cases:
- As the return type of a function (eg void foo();).
- This indicates that the function does not return any value.
- As one of the types in a type set for a function parameter (eg int foo(int|void param)).
- This indicates that the caller of the function may omit that parameter when calling the function, in which case it will default to the special value UNDEFINED.
When creating functions with optional parameters the following functions may be of interest:
- int
undefinedp
(mixed x) - This function returns 1 if x is UNDEFINED, and 0 otherwise.
- int
query_num_arg
() - This function returns the number of arguments that the calling function got called with.
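A small sketch of an optional parameter may help here. The function greet and its parameter names are hypothetical; the example assumes a Pike version that provides undefinedp.

```pike
// 'times' may be omitted by the caller; it is then UNDEFINED.
string greet(string name, int|void times)
{
  if (undefinedp(times)) times = 1;   // Apply a default.
  return ("Hello " + name + "\n") * times;
}

int main()
{
  write(greet("Pike"));      // Written once.
  write(greet("Pike", 2));   // Written twice.
  return 0;
}
```

Note that an explicit 0 is not UNDEFINED, so greet("Pike", 0) returns an empty string rather than falling back to the default.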
3.3.2. mixed
The type mixed is used to indicate that values of any type may be passed here, and that the actual type of the values that will be used at run-time is totally unknown.
This type is typically used when implementing container classes (where the actual values won't be manipulated by the code in the class), or as a convenience fall-back when the actual compile-time type is getting too complicated.
3.3.3. __unknown__
The type __unknown__ is used to indicate that nothing is known about the type of the value. It is the inverse of mixed|void.
It is most commonly used as the type for the content of empty container types (like eg: ({}) (array(__unknown__)) or ([]) (mapping(__unknown__:__unknown__))), or as the type for the many field in a callback function: (eg: function(int, __unknown__...:int)).
Note that this type is new in Pike 9.0. In Pike 8.0 and earlier mixed was used for this in most contexts.
3.3.4. Constructed types
Furthermore, more specific compile-time types may be constructed by either subtyping the basic types by specifying parameters (eg function(string(7bit), int(0..): string(7bit)) instead of just plain function), or by using the type union operator (`|) to specify several alternative types (eg int|float instead of mixed). Note that the run-time type may differ from the declared compile-time type (like eg function(string, int: string(7bit)) and int(17..17) respectively).
3.3.5. Generic types
To improve the type-safety over just using mixed, it is possible to define place-holder types that are replaced with actual types when the class is used.
class Container (<T>) (T|void content) {}

Container(<int>) int_container = Container(<int>)(17);
Container(<float>) float_container = Container(<float>)(17.0);
Container mixed_container = Container("foo");
The above class is similar to:
class Container(mixed|void content) {}
But the former allows the compiler to eg check that only values of the expected types are put in int_container and float_container.
3.4. Modifiers
Modifiers are keywords that may be specified before the type of a symbol declaration or inherit. They typically affect the symbol lookup or other related compiler behavior.
3.4.1. protected
The protected modifier hides symbols from external indexing (ie they are still accessible to subclasses that have inherited the class, but not via predef::`->() or predef::`[]()).
In ancient versions of Pike this modifier was known as static.
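The effect of protected can be sketched as follows. The classes Account and Savings are hypothetical names used only for this illustration.

```pike
class Account {
  protected int balance;

  void deposit(int amount) { balance += amount; }
  int get_balance() { return balance; }
}

class Savings {
  inherit Account;

  // A subclass can still use the protected variable directly.
  void add_bonus() { balance += 5; }
}

int main()
{
  Savings s = Savings();
  s->deposit(10);
  s->add_bonus();
  write("%d\n", s->get_balance());                  // 15

  // The protected symbol is hidden from external indexing,
  // so it does not show up among the indices of the object.
  write("%d\n", has_value(indices(s), "balance"));  // 0
  return 0;
}
```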
3.4.2. local
The local modifier causes use of the symbol from the current class to not be affected by overloading by subclasses.
This modifier is also available with the name inline.
The modifier final is similar.
3.4.3. private
The private modifier hides symbols from internal indexing (ie they are not accessible to subclasses) and implies protected and local.
3.4.4. final
The final modifier causes the compiler to issue an error if a subclass attempts to overload the symbol.
The modifier local is similar, but more permissive.
Note that in ancient versions of Pike this modifier was also available with the name nomask.
3.4.5. optional
The optional modifier causes the type checker to consider the symbol as optional to implement to satisfy the API (albeit if the symbol exists it still must comply with the type).
3.4.6. extern
The extern modifier indicates that a symbol may be implemented by a subclass, but does not define it in the current class. It implies optional.
3.4.7. public
The public modifier causes inherited private symbols to become local protected (and thus available to both the inheriting class and subsequent inherits, albeit not overridable).
Note that the public modifier is only useful for inherit statements. In all other cases it is essentially a no-op.
3.4.8. variant
The variant modifier is used to provide alternative APIs for functions. The different functions will be called depending on what the arguments are when the symbol is called.
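A minimal sketch of how this might look, assuming that every alternative is declared with the variant modifier and that the alternative whose parameter types match the arguments is the one called. The class Describer is a hypothetical name.

```pike
class Describer {
  // Two alternatives for the same symbol, selected by argument type.
  variant string describe(int x) { return "int: " + x; }
  variant string describe(string s) { return "string: " + s; }
}

int main()
{
  Describer d = Describer();
  write("%s\n", d->describe(17));
  write("%s\n", d->describe("foo"));
  return 0;
}
```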
3.4.9. __weak__
The __weak__ modifier is used to indicate to the garbage collector that it may clear the variable if it is the only holder of the value.
Note that this modifier is new in Pike 9.0.
3.4.10. __unused__
The __unused__ modifier is used to inhibit the warning that the symbol is not used.
Note that this modifier is new in Pike 9.0.
3.4.11. __generator__
The __generator__ modifier converts a function into a function that returns a restartable function.
__generator__ int counter(int start, int stop)
{
  while (start < stop) {
    continue return start++;
  }
  return stop;
}
The above behaves similarly to:
function(:int) counter(int start, int stop)
{
  return lambda() {
    if (start <= stop) {
      return start++;
    }
    return UNDEFINED;
  };
}
Note that this modifier is new in Pike 9.0.
3.4.12. __async__
The __async__ modifier converts a function into an asynchronous function. For such functions an implicit Concurrent.Promise object is allocated, and all returns and yields are converted into setting the promise followed by returning UNDEFINED, except for the first return or yield which will return the Concurrent.Future corresponding to the promise.
__async__ int foo(mixed ... args) { return 17; }
The above behaves similarly to:
Concurrent.Future(<int>) foo(mixed ... args)
{
  Concurrent.Promise(<int>) __async_promise__ =
    Concurrent.Promise(<int>)();
  __generator__ lambda() {
    __async_promise__->failure(catch {
        __async_promise__->success(17);
        return UNDEFINED;
      });
    return UNDEFINED;
  }()();
  return __async_promise__->future();
}
Restartable functions are required in order to be able to use predef::await().
Note that this modifier is new in Pike 9.0.
3.4.13. static
The static modifier is currently identical to the protected modifier except for warning that it is being used (as of Pike 9.0).
Note: In a future version of Pike this may change. Do not use.
3.5. Sharing data
As mentioned in the beginning of this chapter, the assignment operator (=) does not copy anything when you use it on a pointer type. Instead it just creates another reference to the memory object. In most situations this does not present a problem, and it speeds up Pike's performance. However, you must be aware of this when programming. This can be illustrated with an example:
int main(int argc, array(string) argv)
{
  array(string) tmp;
  tmp=argv;
  argv[0]="Hello world.\n";
  write(tmp[0]);
}
This program will of course write Hello world. The assignment tmp=argv did not copy the array, so tmp and argv refer to the same array.
Sometimes you want to create a copy of a mapping, array or object. To do so you simply call copy_value with whatever you want to copy as argument. copy_value is recursive, which means that if you have an array containing arrays, copies will be made of all those arrays.
If you don't want to copy recursively, or you know you don't have to copy recursively, you can use the plus operator instead. For instance, to create a copy of an array you simply add an empty array to it, like this: copy_of_arr = arr + ({}); If you need to copy a mapping you use an empty mapping, and for a multiset you use an empty multiset.
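The difference between sharing, a non-recursive copy and a recursive copy can be sketched like this. The variable names are chosen for the example only.

```pike
int main()
{
  array(array(int)) a = ({ ({ 1, 2 }), ({ 3, 4 }) });

  array(array(int)) alias = a;              // Same array, no copy.
  array(array(int)) shallow = a + ({});     // New outer array, shared inner arrays.
  array(array(int)) deep = copy_value(a);   // Everything copied recursively.

  a[0][0] = 99;     // Changes an inner array.
  a[1] = ({ 0 });   // Changes the outer array.

  write("%d\n", alias[0][0]);         // 99, alias is the same array as a.
  write("%d\n", shallow[0][0]);       // 99, the inner array is shared.
  write("%d\n", sizeof(shallow[1]));  // 2, the outer array was copied.
  write("%d\n", deep[0][0]);          // 1, nothing is shared.
  return 0;
}
```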
3.6. Variables
When declaring a variable, you also have to specify what type of variable it is. For most types, such as int and string this is very easy. But there are much more interesting ways to declare variables than that, let's look at a few examples:
int x; // x is an integer
int|string x; // x is a string or an integer
array(string) x; // x is an array of strings
array x; // x is an array of mixed
mixed x; // x can be any type
string *x; // x is an array of strings

// x is a mapping from string to int
mapping(string:int) x;

// x implements Stdio.File
Stdio.File x;

// x implements Stdio.File
object(Stdio.File) x;

// x is a function that takes two integer
// arguments and returns a string
function(int,int:string) x;

// x is a function taking any amount of
// integer arguments and returns nothing.
function(int...:void) x;

// x is ... complicated
mapping(string:function(string|int...:mapping(string:array(string)))) x;
As you can see there are some interesting ways to specify types. Here is a list of what is possible:
- mixed
- This means that the variable can contain any type, or the function return any value.
- array( type )
- This means an array of elements with the type type.
- mapping( key type : value type )
- This is a mapping where the keys are of type key type and the values of value type.
- multiset ( type )
- This means a multiset containing values of the type type.
- object ( program )
- This means an object which 'implements' the specified program. The program can be a class, a constant, or a string. If the program is a string it will be cast to a program first. See the documentation for inherit for more information about this casting. The compiler will assume that any function or variable accessed in this object has the same type information as that function or variable has in program.
- program
- This too means 'an object which implements program'. program can be a class or a constant.
- function( argument types : return type )
- This is a function taking the specified arguments and returning return type. The argument types is a comma separated list of types that specify the arguments. The argument list can also end with ... to signify that there can be any amount of the last type.
- type1 | type2
- This means either type1 or type2
- void
- Void can only be used in certain places, if used as return type for a function it means that the function does not return a value. If used in the argument list for a function it means that that argument can be omitted. Example: function(int|void:void) this means a function that may or may not take an integer argument and does not return a value.
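To tie the list together, here is a small sketch that declares variables with several of these types and uses them. The functions join_ints and total are hypothetical helpers written for the example.

```pike
// Hypothetical helper: matches function(int,int:string).
string join_ints(int a, int b) { return sprintf("%d,%d", a, b); }

// Hypothetical helper: matches function(int...:int).
int total(int ... nums) { return `+(0, @nums); }

int main()
{
  int|string id = 17;   // Holds an integer...
  id = "seventeen";     // ...or a string.

  array(string) names = ({ "a", "b" });
  mapping(string:int) ages = ([ "a": 1, "b": 2 ]);

  function(int,int:string) f = join_ints;
  function(int...:int) g = total;

  write("%s %d %s %d\n", (string)id, ages[names[1]], f(1, 2), g(1, 2, 3));
  return 0;
}
```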