Data and Values

In this section, we will compare the encoding of Guile variables to C variables. If C and Guile are going to cooperate, they have to be able to pass data back and forth. Fortunately, since Guile itself was written in C, both Guile and C encode data in a relatively similar manner. Understanding the encoding will help the programmer ensure that no numerical precision is lost when transferring data between the two languages. To this end, there will be a review of the fundamental C and Guile data types, focusing on how they compare with one another.

When the default data types are not good enough to allow cooperation between C and Scheme, one can define new types (known as SMOBs) that can be passed between C and Guile. This is discussed later in the chapter called Custom data types (SMOBs). Let us first see how well we can do with the normal data types.

C is a statically typed language. When you create a variable in C, you have to decide at that moment what sort of information it is going to hold. In Scheme, a variable can hold any sort of data. You can create a variable first and decide what it is going to hold later. A variable can be set to the real number 3.0 in one instance and then to the string ``hi mom'' the next, which is something that never happens in C.

C has a larger catalog of intrinsic data types than Guile. In C there are short, medium, long, and (sometimes) very long integers, both signed and unsigned. C has single precision and double precision real numbers. Guile has two types of signed integer and one size of real number.

But with aggregate data types, Guile has the greater variety. C has arrays, structs, and enums. Each of these types, for the most part, requires that one decide what it will hold at the time it is declared. Guile has pairs, lists, and vectors. With pairs and lists, each element can be of any type, and does not have to remain the same type throughout the life of the program.

Normally when coding in only Scheme, one wouldn't worry so much about the low-level encoding of numbers. Since the goal of this book is to make C and Guile cooperate, knowing this detail is necessary.

In this chapter, only the simple data types will be discussed. The aggregate data types will be taken up in the chapter called Arrays, and other compound data types.

Guile's Simple Data Types

Booleans

Some languages have a special type for booleans, which is a data type that only allows the values ``true'' and ``false''.

Traditional C does not have a special boolean type (C99 adds _Bool and the header stdbool.h, but a great deal of existing code predates it). The control structures that take TRUE or FALSE values, such as the if statement, are really just checking whether their integer argument is zero or non-zero. It is quite common to use the #define directive to associate TRUE and FALSE with the integer values 1 and 0, but, in C, every non-zero number is treated as true.

bool

Unlike C, Guile has a separate boolean type: type bool. Guile boolean values can only be true or false, which in Scheme notation are denoted #t and #f.

Even though the Guile type bool can only be true or false, Guile conditional expressions will take any type of data. In conditional expressions, only #f counts as false. Anything else passed into a conditional expression counts as true. Thus in Guile, even an integer zero is treated as true in conditional expressions, which is unlike C.
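The difference between the two truth rules can be sketched in plain C. This is only a model: the struct model_value type and the function names are hypothetical, invented for illustration, and are not part of Guile's C interface. The sketch shows the rule described above, nothing more.

```c
/* The C rule: zero is false, every other integer is true. */
int c_is_true(int x)
{
    return x != 0;
}

/* A hypothetical stand-in for a Scheme value that carries its own type.
   This is NOT how Guile represents values; it is only a model. */
enum model_tag { MODEL_BOOLEAN, MODEL_INTEGER, MODEL_STRING };

struct model_value {
    enum model_tag tag;
    union {
        int boolean;          /* 0 models #f, 1 models #t */
        long integer;
        const char *string;
    } as;
};

/* The Scheme rule: only the boolean #f is false. */
int scheme_is_true(struct model_value v)
{
    return !(v.tag == MODEL_BOOLEAN && v.as.boolean == 0);
}
```

Under this model, an integer zero is false by the C rule but true by the Scheme rule, which is exactly the trap to watch for when a value crosses from one language to the other.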

Integers

C has integers in a variety of flavors. Most C compilers have integers that are either short, int, or long. On my standard 32-bit Intel processor system with a GCC compiler, short is 16 bits, and int and long are both 32 bits. There are also unsigned versions of these integers, denoted unsigned short, unsigned, and unsigned long.

Some C compilers have even longer integers of type long long.
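Since these widths differ from one system to the next, it is worth asking your own compiler rather than trusting my numbers. The following is a minimal sketch using only the standard headers; the macro BITS_OF and the function print_integer_widths are names made up for this example, and long long assumes a compiler that supports it.

```c
#include <limits.h>
#include <stdio.h>

/* Storage width in bits of a type, as this compiler sees it. */
#define BITS_OF(type) ((int)(sizeof(type) * CHAR_BIT))

/* Print the width of each integer flavor on the current system. */
void print_integer_widths(void)
{
    printf("short:     %d bits\n", BITS_OF(short));
    printf("int:       %d bits\n", BITS_OF(int));
    printf("long:      %d bits\n", BITS_OF(long));
    printf("long long: %d bits\n", BITS_OF(long long));
}
```

Calling print_integer_widths() from main shows at a glance whether long is 32 or 64 bits on the machine at hand.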

Guile has two types of integers: inums and bignums.

inum or fixnum

This type, whose name is short for immediate number, is Guile's main integer type. It is a signed integer that is 2 bits smaller than the C long. So on my system with 32-bit long integers, an inum is a 30-bit signed integer. Rather confusingly, it is also referred to as a fixnum.
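Taking that description at face value (a signed integer two bits narrower than long), the range of an inum can be worked out from the limits of long. The sketch below does so; the function names are hypothetical and are not Guile's own macros.

```c
#include <limits.h>

/* Dropping two bits from a signed long divides its range by four.
   With a 32-bit long this gives 2^29 - 1 and -2^29. */
long inum_max(void)
{
    return LONG_MAX / 4;
}

long inum_min(void)
{
    return LONG_MIN / 4;
}

/* Would this C long fit in an inum without loss? */
int fits_inum(long n)
{
    return n >= inum_min() && n <= inum_max();
}
```

A C long that fails fits_inum() is not necessarily lost when handed to Guile, but it can no longer travel as an inum.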

bignum

In addition to the inum type, Guile also has a signed integer type of arbitrary precision, called bignum. These integers do not have a fixed number of bits, so they can hold numbers of any size that memory allows.

This number type may not be included in Guile, depending on the compilation options.
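The C side has no such safety net: a long has a fixed width, and a sum that would be an ordinary bignum in Guile simply does not fit. As a hedged illustration, here is one portable way to test whether an addition stays within long before performing it. The function name is invented for this example.

```c
#include <limits.h>

/* Return 1 if a + b can be represented as a long, 0 if it would
   overflow. The test itself never overflows. */
int add_fits_in_long(long a, long b)
{
    if (b > 0)
        return a <= LONG_MAX - b;
    return a >= LONG_MIN - b;
}
```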

Real Numbers

Most C compilers have real numbers that come in two degrees of precision. The lower precision real numbers are of type float, and the higher precision numbers are of type double. Sometimes there is an even higher precision type of real number, denoted long double.

In my generic 32-bit Intel system with the GCC compiler, float is 4 bytes long, double is 8 bytes long, and long double is 12 bytes long.

real

The Guile real type is the same as a C double type.

Guile real numbers are only available in this size. There are no float or long double real numbers in Guile.
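Because a Guile real corresponds to a C double, the place where precision can slip away is on the C side, when a value is stored in a float. The following sketch checks whether a given double survives that narrowing. The function name is made up for this example, and it assumes the value lies within the range of float.

```c
/* Return 1 if x is unchanged after a round trip through float.
   The volatile store forces the value to be rounded to float
   precision rather than kept in a wider register. */
int survives_float(double x)
{
    volatile float narrowed = (float)x;
    return (double)narrowed == x;
}
```

Values such as 0.5 or 3.0 pass the test, while 0.1 does not, because the nearest float to 0.1 is a different number from the nearest double.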

Complex Numbers

C99 is supposed to have complex numbers of three sizes. They are complex float, complex double, and complex long double. The complex float type has the real part and the imaginary part each stored in a real number of type float, and similarly for the other two sizes. C99 is not yet universally implemented, so check your local listings.

complex

The Guile complex has both the real and the imaginary part stored in a Guile real, which is the same size as a C double.

Characters

C characters are stored as type char, which is normally an 8-bit integer. Depending on the compiler, this may be signed or unsigned. C also has a wide character type, wchar_t, suitable for storing Unicode characters that are 16 or 32 bits in size.

char

The Guile char type is an 8-bit unsigned integer, and as such, its string support is only intended for 8-bit character sets. Guile does not support wide characters, and is not a good choice for manipulating UTF-16 or UTF-32 strings.
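Since a plain C char may be signed while the Guile char described above is unsigned, characters above 127 deserve some care on the C side. The sketch below, with function names invented for this example, reports which kind of char the compiler uses and reads a char as an unsigned value either way.

```c
#include <limits.h>

/* Does this compiler treat plain char as a signed type? */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Read a char as a value in 0..UCHAR_MAX, whether or not plain
   char is signed on this compiler. */
unsigned int char_as_unsigned(char c)
{
    return (unsigned char)c;
}
```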

Strings

A C string is actually an array of characters. After the last character in the string, a zero byte is appended to mark the end of the string.

string

A Guile string is a sequence of characters, much like a C string. One critical difference is that Guile keeps track of the length of strings. Because of this, Guile does not require strings to be terminated with the C (char)0. When passing strings back and forth between Guile and C, it is necessary to deal with the differences in zero termination.
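To make the termination issue concrete, here is a sketch in plain C of turning a length-counted run of characters into a zero-terminated C string. It does not use Guile's own conversion routines. The function counted_to_c_string is a hypothetical helper, and it assumes the characters themselves contain no zero byte.

```c
#include <stdlib.h>
#include <string.h>

/* Copy len characters from data (which must not be NULL) into a
   freshly allocated buffer and append the terminating zero that C
   expects. Returns NULL if allocation fails. The caller must free()
   the result. */
char *counted_to_c_string(const char *data, size_t len)
{
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, data, len);
    out[len] = '\0';
    return out;
}
```

Going the other way is simpler, since strlen() recovers the length that a counted string needs.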