

Chapter 30 - Practical: An HTML Generation Library, the Interpreter
Practical Common Lisp
by Peter Seibel
Apress © 2005
Character Escaping
The first bit of the foundation you’ll need to lay is the code that knows how to escape characters with a special meaning in HTML. There are three such characters, and they must not appear in the text of an element or in an attribute value; they are <, >, and &. In element text or attribute values, these characters must be replaced with the character reference entities &lt;, &gt;, and &amp;. Similarly, in attribute values, the quotation marks used to delimit the value must be escaped, ' with &apos; and " with &quot;. Additionally, any character can be represented by a numeric character reference entity consisting of an ampersand, followed by a sharp sign, followed by the numeric code as a base 10 integer, and followed by a semicolon. These numeric escapes are sometimes used to embed non-ASCII characters in HTML.
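As a small illustration of the numeric form, here is a minimal sketch that builds such a reference from a character's code. The name numeric-entity is hypothetical and used only for this example; it isn't part of FOO. The sketch also assumes that your implementation's character codes agree with ASCII and Unicode, which the language standard doesn't strictly guarantee but which is typical in practice.

```lisp
;; Hypothetical helper for illustration only; not part of FOO's API.
(defun numeric-entity (char)
  "Return a decimal numeric character reference for CHAR."
  ;; Ampersand, sharp sign, the code as a base 10 integer, semicolon.
  (format nil "&#~d;" (char-code char)))

;; Assuming ASCII-compatible character codes:
(numeric-entity #\A)              ; "&#65;"
;; Assuming a Unicode-based implementation, code 233 is e with an acute accent:
(numeric-entity (code-char 233))  ; "&#233;"
```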
THE PACKAGE
Since FOO is a low-level library, the package you develop it in doesn’t rely on much external code, just the usual dependency on names from the COMMON-LISP package and, almost as usual, on the names of the macro-writing macros from COM.GIGAMONKEYS.MACRO-UTILITIES. On the other hand, the package needs to export all the names needed by code that uses FOO. Here’s the DEFPACKAGE from the source that you can download from the book’s Web site:
(defpackage :com.gigamonkeys.html
  (:use :common-lisp :com.gigamonkeys.macro-utilities)
  (:export :with-html-output
           :in-html-style
           :define-html-macro
           :html
           :emit-html
           :&attributes))
The following function accepts a single character and returns a string containing a character reference entity for that character:
(defun escape-char (char)
  (case char
    (#\& "&amp;")
    (#\< "&lt;")
    (#\> "&gt;")
    (#\' "&apos;")
    (#\" "&quot;")
    (t (format nil "&#~d;" (char-code char)))))
You can use this function as the basis for a function, escape, that takes a string and a sequence of characters and returns a copy of the first argument with all occurrences of the characters in the second argument replaced with the corresponding character entity returned by escape-char.
(defun escape (in to-escape)
  (flet ((needs-escape-p (char) (find char to-escape)))
    (with-output-to-string (out)
      (loop for start = 0 then (1+ pos)
            for pos = (position-if #'needs-escape-p in :start start)
            do (write-sequence in out :start start :end pos)
            when pos do (write-sequence (escape-char (char in pos)) out)
            while pos))))
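If the interplay of start and pos in that LOOP isn't obvious, it may help to look at the pieces it walks over. The following is a sketch only: escape-segments is a hypothetical function, not part of FOO, that uses the same stepping logic but collects the unescaped runs and the characters between them into a list instead of writing them to a stream.

```lisp
;; Hypothetical tracing aid for illustration only; not part of FOO.
(defun escape-segments (in to-escape)
  "Return, in order, the runs of IN that need no escaping and the
characters between them that do."
  (loop for start = 0 then (1+ pos)
        for pos = (position-if (lambda (c) (find c to-escape)) in :start start)
        ;; The run from START up to the next special character,
        ;; or to the end of the string when POS is NIL.
        collect (subseq in start pos)
        ;; The special character itself, which ESCAPE would replace.
        when pos collect (char in pos)
        while pos))

(escape-segments "a<b&c" "<>&")  ; ("a" #\< "b" #\& "c")
(escape-segments "a&" "<>&")     ; ("a" #\& "")
```

Each pass handles one unescaped run plus, if there is one, the special character that ends it. The loop stops after the pass in which POSITION-IF finds nothing more, which is why the final run is still processed before the WHILE clause ends the iteration.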
You can also define two parameters: *element-escapes*, which contains the characters you need to escape in normal element data, and *attribute-escapes*, which contains the set of characters to be escaped in attribute values.
(defparameter *element-escapes* "<>&")
(defparameter *attribute-escapes* "<>&\"'")
Here are some examples:
HTML> (escape "foo & bar" *element-escapes*)
"foo &amp; bar"
HTML> (escape "foo & 'bar'" *element-escapes*)
"foo &amp; 'bar'"
HTML> (escape "foo & 'bar'" *attribute-escapes*)
"foo &amp; &apos;bar&apos;"
Finally, you’ll need a variable, *escapes*, that will be bound to the set of characters that need to be escaped. It’s initially set to the value of *element-escapes*, but when generating attributes, it will, as you’ll see, be rebound to the value of *attribute-escapes*.
(defvar *escapes* *element-escapes*)
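Rebinding works here because DEFVAR makes *escapes* a special (dynamic) variable, so a LET that binds it affects every function called within the LET's extent. The following minimal sketch shows the effect. It repeats the three variable definitions so it can be loaded on its own, and the helper must-escape-p and the function demo-rebinding are hypothetical names used only for illustration, not the code FOO actually uses to generate attributes.

```lisp
(defparameter *element-escapes* "<>&")
(defparameter *attribute-escapes* "<>&\"'")
(defvar *escapes* *element-escapes*)

;; Hypothetical helper for illustration only; not part of FOO.
(defun must-escape-p (char)
  "Return T if CHAR is in the currently bound set of escapes, else NIL."
  (and (find char *escapes*) t))

(defun demo-rebinding ()
  (list (must-escape-p #\')                    ; element context
        (let ((*escapes* *attribute-escapes*)) ; attribute context
          (must-escape-p #\'))
        (must-escape-p #\')))                  ; binding undone after the LET

(demo-rebinding)  ; (NIL T NIL)
```

The apostrophe needs escaping only while the LET is in effect; once control leaves it, *escapes* has its previous value again without any explicit cleanup.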