
1590592395


Chapter 4 - Syntax and Semantics

Practical Common Lisp

by Peter Seibel

Apress © 2005




S-expressions

The basic elements of s-expressions are lists and atoms. Lists are delimited by parentheses and can contain any number of whitespace-separated elements. Atoms are everything else.[5] The elements of lists are themselves s-expressions (in other words, atoms or nested lists). Comments (which aren’t, technically speaking, s-expressions) start with a semicolon, extend to the end of a line, and are treated essentially like whitespace.

And that’s pretty much it. Since lists are syntactically so trivial, the only remaining syntactic rules you need to know are those governing the form of different kinds of atoms. In this section I’ll describe the rules for the most commonly used kinds of atoms: numbers, strings, and names. After that, I’ll cover how s-expressions composed of these elements can be evaluated as Lisp forms.
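One way to poke at these rules is to hand text to the reader yourself. The following is a minimal sketch rather than anything from the chapter: READ-FROM-STRING, LISTP, and ATOM are standard functions, while the helper name read-sexp is made up for illustration.

```lisp
(defun read-sexp (text)
  "Hypothetical helper: read one s-expression from the string TEXT."
  (let ((*read-eval* nil))               ; refuse #. evaluation while reading
    (values (read-from-string text))))   ; keep only the object, drop the position

(listp (read-sexp "(a (b c) 42)"))       ; → T, parentheses delimit a list
(atom  (read-sexp "42"))                 ; → T, a number is an atom
(atom  (read-sexp "\"foo\""))            ; → T, so is a string

;; The empty list is both an atom and a list (see footnote 5).
(let ((empty (read-sexp "()")))
  (and (atom empty) (listp empty)))      ; → T

;; A comment runs to the end of its line and is skipped like whitespace.
(equal (read-sexp (format nil "(1 ; ignored~% 2)"))
       (read-sexp "(1 2)"))              ; → T
```

Binding *READ-EVAL* to NIL is only a precaution for reading text you didn’t write; it doesn’t change how any of these examples are read.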

Numbers are fairly straightforward: any sequence of digits, possibly prefaced with a sign (+ or -), containing a decimal point (.) or a solidus (/), or ending with an exponent marker, is read as a number. For example:

123       ; the integer one hundred twenty-three

3/7       ; the ratio three-sevenths

1.0       ; the floating-point number one in default precision

1.0e0     ; another way to write the same floating-point number

1.0d0     ; the floating-point number one in "double" precision

1.0e-4    ; the floating-point equivalent to one-ten-thousandth

+42       ; the integer forty-two

-42       ; the integer negative forty-two

-1/4      ; the ratio negative one-quarter

-2/8      ; another way to write negative one-quarter

246/2     ; another way to write the integer one hundred twenty-three

These different forms represent different kinds of numbers: integers, ratios, and floating point. Lisp also supports complex numbers, which have their own notation and which I’ll discuss in Chapter 10.

As some of these examples suggest, you can notate the same number in many ways. But regardless of how you write them, all rationals, both integers and ratios, are represented internally in “simplified” form. In other words, the objects that represent -2/8 or 246/2 aren’t distinct from the objects that represent -1/4 and 123. Similarly, 1.0 and 1.0e0 are just different ways of writing the same number. On the other hand, 1.0, 1.0d0, and 1 can all denote different objects because the different floating-point representations and integers are different types. We’ll save the details about the characteristics of different kinds of numbers for Chapter 10.
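These claims are easy to check with EQL, which tells you whether two numbers have the same type and the same value. This is a small sketch under stated assumptions: the helper read-number is a made-up name, and it binds *READ-DEFAULT-FLOAT-FORMAT* to SINGLE-FLOAT (the standard default) so that 1.0 is read in default precision regardless of how your Lisp is configured.

```lisp
(defun read-number (text)
  "Hypothetical helper: read TEXT using the standard default float format."
  (let ((*read-eval* nil)
        (*read-default-float-format* 'single-float)) ; the standard default
    (values (read-from-string text))))

;; Rationals are reduced to lowest terms when read.
(eql (read-number "-2/8")  (read-number "-1/4"))    ; → T
(eql (read-number "246/2") (read-number "123"))     ; → T
(integerp (read-number "246/2"))                    ; → T

;; Two spellings of the same single-precision float.
(eql (read-number "1.0") (read-number "1.0e0"))     ; → T

;; Same mathematical value, but different types, so different objects.
(eql (read-number "1.0") (read-number "1.0d0"))     ; → NIL
(eql (read-number "1.0") (read-number "1"))         ; → NIL

;; = compares numeric value only, ignoring type.
(= (read-number "1.0") (read-number "1.0d0") (read-number "1")) ; → T
```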

String literals, as you saw in the previous chapter, are enclosed in double quotes. Within a string a backslash (\) escapes the next character, causing it to be included in the string regardless of what it is. The only two characters that must be escaped within a string are double quotes and the backslash itself. All other characters can be included in a string literal without escaping, regardless of their meaning outside a string. Some example string literals are as follows:

"foo"     ; the string containing the characters f, o, and o.

"fo\o"    ; the same string

"fo\\o"   ; the string containing the characters f, o, \, and o.

"fo\"o"   ; the string containing the characters f, o, ", and o.

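If you want to confirm how the escapes in these literals are read, you can ask for the length and individual characters of each string. This is just an illustrative sketch using the standard functions LENGTH, CHAR, and STRING=.

```lisp
(length "foo")            ; → 3
(string= "foo" "fo\o")    ; → T, escaping an ordinary character changes nothing

(length "fo\\o")          ; → 4, the two backslashes read as one character
(char "fo\\o" 2)          ; → #\\  (the backslash character)

(length "fo\"o")          ; → 4, the escaped double quote is part of the string
(char "fo\"o" 2)          ; → #\"  (the double-quote character)
```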
Names used in Lisp programs, such as FORMAT, hello-world, and *db*, are represented by objects called symbols. The reader knows nothing about how a given name is going to be used, whether it’s the name of a variable, a function, or something else. It just reads a sequence of characters and builds an object to represent the name.[6] Almost any character can appear in a name. Whitespace characters can’t, though, because the elements of lists are separated by whitespace. Digits can appear in names as long as the name as a whole can’t be interpreted as a number. Similarly, names can contain periods, but the reader can’t read a name that consists only of periods. Ten characters that serve other syntactic purposes can’t appear in names: open and close parentheses, double and single quotes, backtick, comma, colon, semicolon, backslash, and vertical bar. And even those characters can, if you’re willing to escape them by preceding the character to be escaped with a backslash or by surrounding the part of the name containing characters that need escaping with vertical bars.
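A few of these rules can be tried out by reading tokens from strings. The sketch below assumes the standard reader settings; the helper name read-token is made up for illustration. Note that inside a Lisp string literal a backslash must itself be written as \\, so the string "foo\\(bar" contains the text foo\(bar.

```lisp
(defun read-token (text)
  "Hypothetical helper: read one token from the string TEXT."
  (let ((*read-eval* nil))
    (values (read-from-string text))))

(symbolp (read-token "hello-world"))        ; → T
(numberp (read-token "123"))                ; → T, the whole token is a number
(symbolp (read-token "1+"))                 ; → T, digits are fine in a name

;; Escaping lets otherwise forbidden characters into a name.
(symbol-name (read-token "foo\\(bar"))      ; → "FOO(BAR"
(symbol-name (read-token "|hello world|"))  ; → "hello world"

;; A token made only of periods can't be read as a name.
(handler-case (read-token "...")
  (error () :unreadable))                   ; → :UNREADABLE
```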

Two important characteristics of the way the reader translates names to symbol objects have to do with how it treats the case of letters in names and how it ensures that the same name is always read as the same symbol. While reading names, the reader converts all unescaped characters in a name to their uppercase equivalents. Thus, the reader will read foo, Foo, and FOO as the same symbol: FOO. However, \f\o\o and |foo| will both be read as foo, which is a different object than the symbol FOO. This is why when you define a function at the REPL and it prints the name of the function, it’s been converted to uppercase. Standard style, these days, is to write code in all lowercase and let the reader change names to uppercase.[7]
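You can see the case conversion directly by asking symbols for their names. This short sketch assumes the reader’s default case-converting behavior (see footnote 7); SYMBOL-NAME and EQ are standard functions.

```lisp
(symbol-name 'foo)       ; → "FOO"
(eq 'foo 'Foo)           ; → T
(eq 'Foo 'FOO)           ; → T

;; Escaped characters keep their case.
(symbol-name '|foo|)     ; → "foo"
(eq '\f\o\o '|foo|)      ; → T, two ways to escape the same name
(eq 'foo '|foo|)         ; → NIL, FOO and foo are different symbols
```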

To ensure that the same textual name is always read as the same symbol, the reader interns symbols: after it has read the name and converted it to all uppercase, the reader looks in a table called a package for an existing symbol with the same name. If it can’t find one, it creates a new symbol and adds it to the table. Otherwise, it returns the symbol already in the table. Thus, anywhere the same name appears in any s-expression, the same object will be used to represent it.[8]
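The lookup-then-create behavior can be observed with FIND-SYMBOL, which looks a name up in a package without creating anything. The following is a rough sketch, not the reader’s actual implementation: the function name demo-intern and the symbol name WIDGET are made up, and the function uses a throwaway package so the result doesn’t depend on what you’ve already typed at the REPL.

```lisp
(defun demo-intern ()
  "Hypothetical demo: return (BEFORE SAME-OBJECT-P NOW-IN-TABLE-P)."
  (let ((scratch (make-package (symbol-name (gensym "INTERN-DEMO-"))
                               :use '())))            ; fresh, empty table
    (unwind-protect
         (let ((*package* scratch))                   ; the reader interns here
           (let* ((before (find-symbol "WIDGET"))     ; not in the table yet
                  (first  (read-from-string "widget")) ; created and added
                  (second (read-from-string "Widget"))) ; found, not re-created
             (list before
                   (eq first second)
                   (eq first (find-symbol "WIDGET")))))
      (delete-package scratch))))                     ; clean up the scratch table

(demo-intern)   ; → (NIL T T)
```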

Because names can contain many more characters in Lisp than they can in Algol-derived languages, certain naming conventions are distinct to Lisp, such as the use of hyphenated names like hello-world. Another important convention is that global variables are given names that start and end with *. Similarly, constants are given names starting and ending in +. And some programmers will name particularly low-level functions with names that start with % or even %%. The names defined in the language standard use only the alphabetic characters (A–Z) plus *, +, -, /, 1, 2, <, =, >, and &.
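To make the conventions concrete, here is a small sketch that uses each of them once. Apart from *db*, which appears earlier as an example name, every name here (+max-records+, %push-record, add-record) is invented for illustration.

```lisp
(defvar *db* nil
  "Global variable: the name starts and ends with *.")

(defconstant +max-records+ 100
  "Constant: the name starts and ends with +.")

(defun %push-record (record)
  "Low-level helper (leading %): pushes RECORD without any checking."
  (push record *db*))

(defun add-record (record)
  "Public entry point with a hyphenated name: checks the limit first."
  (when (< (length *db*) +max-records+)
    (%push-record record)))
```

Nothing in the language enforces these conventions; they simply tell a human reader what kind of thing a name refers to.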

The syntax for lists, numbers, strings, and symbols can describe a good percentage of Lisp programs. Other rules describe notations for literal vectors, individual characters, and arrays, which I’ll cover when I talk about the associated data types in Chapters 10 and 11. For now the key thing to understand is how you can combine numbers, strings, and symbols with parentheses-delimited lists to build s-expressions representing arbitrary trees of objects. Some simple examples look like this:

x             ; the symbol X

()            ; the empty list

(1 2 3)       ; a list of three numbers

("foo" "bar") ; a list of two strings

(x y z)       ; a list of three symbols

(x 1 "foo")   ; a list of a symbol, a number, and a string

(+ (* 2 3) 4) ; a list of a symbol, a list, and a number.

An only slightly more complex example is the following four-item list that contains two symbols, the empty list, and another list, itself containing two symbols and a string:

(defun hello-world ()

  (format t "hello, world"))

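To see that this text really is just a four-item list, you can read it from a string and take it apart with ordinary list functions. This is an illustrative sketch; the variable name *hello-form* is made up, and nothing here evaluates the form, so no function is defined.

```lisp
(defparameter *hello-form*
  (let ((*read-eval* nil))
    (values
     (read-from-string
      "(defun hello-world () (format t \"hello, world\"))")))
  "Hypothetical variable holding the s-expression as plain data.")

(length *hello-form*)    ; → 4
(first  *hello-form*)    ; → DEFUN, a symbol
(second *hello-form*)    ; → HELLO-WORLD, a symbol
(third  *hello-form*)    ; → NIL, the empty list
(fourth *hello-form*)    ; → (FORMAT T "hello, world"), a nested list
(length (fourth *hello-form*))       ; → 3, two symbols and a string
(stringp (third (fourth *hello-form*))) ; → T
```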
[5]The empty list, (), which can also be written NIL, is both an atom and a list.

[6]In fact, as you’ll see later, names aren’t intrinsically tied to any one kind of thing. You can use the same name, depending on context, to refer to both a variable and a function, not to mention several other possibilities.

[7]The case-converting behavior of the reader can, in fact, be customized, but understanding when and how to change it requires a much deeper discussion of the relation between names, symbols, and other program elements than I’m ready to get into just yet.

[8]I’ll discuss the relation between symbols and packages in more detail in Chapter 21.
