
1590592395

Chapter 30 - Practical: An HTML Generation Library, the Interpreter

Practical Common Lisp

by Peter Seibel

Apress © 2005




Character Escaping

The first bit of the foundation you’ll need to lay is the code that knows how to escape characters with a special meaning in HTML. There are three such characters, and they must not appear in the text of an element or in an attribute value; they are <, >, and &. In element text or attribute values, these characters must be replaced with the character reference entities &lt;, &gt;, and &amp;. Similarly, in attribute values, the quotation marks used to delimit the value must be escaped, ' with &apos; and " with &quot;. Additionally, any character can be represented by a numeric character reference entity consisting of an ampersand, followed by a sharp sign, followed by the numeric code as a base 10 integer, and followed by a semicolon. These numeric escapes are sometimes used to embed non-ASCII characters in HTML.
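As a small illustration of the numeric form, the following sketch builds a decimal character reference directly. The helper name numeric-reference is made up for this example and isn't part of FOO, and the results in the comments assume an implementation whose character codes follow ASCII and Unicode, which is typical but not required by the language standard.

```lisp
;; Hypothetical helper, not part of FOO: build a decimal numeric
;; character reference such as "&#65;" for any character.
(defun numeric-reference (char)
  (format nil "&#~d;" (char-code char)))

(numeric-reference #\A)             ; "&#65;" where codes follow ASCII
(numeric-reference (code-char 233)) ; "&#233;" (e with acute accent) under Unicode
```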

Start Sidebar

THE PACKAGE

Since FOO is a low-level library, the package you develop it in doesn’t rely on much external code, just the usual dependency on names from the COMMON-LISP package and, almost as usual, on the names of the macro-writing macros from COM.GIGAMONKEYS.MACRO-UTILITIES. On the other hand, the package needs to export all the names needed by code that uses FOO. Here’s the DEFPACKAGE from the source that you can download from the book’s Web site:

(defpackage :com.gigamonkeys.html
  (:use :common-lisp :com.gigamonkeys.macro-utilities)
  (:export :with-html-output
           :in-html-style
           :define-html-macro
           :html
           :emit-html
           :&attributes))

End Sidebar


The following function accepts a single character and returns a string containing a character reference entity for that character:

(defun escape-char (char)
  (case char
    (#\& "&amp;")
    (#\< "&lt;")
    (#\> "&gt;")
    (#\' "&apos;")
    (#\" "&quot;")
    ;; Any other character gets a decimal numeric character reference.
    (t (format nil "&#~d;" (char-code char)))))

You can use this function as the basis for a function, escape, that takes a string and a sequence of characters and returns a copy of the first argument with all occurrences of the characters in the second argument replaced with the corresponding character entity returned by escape-char.

(defun escape (in to-escape)
  (flet ((needs-escape-p (char) (find char to-escape)))
    (with-output-to-string (out)
      (loop for start = 0 then (1+ pos)
            for pos = (position-if #'needs-escape-p in :start start)
            ;; Copy the unescaped text up to POS (or to the end when POS is NIL).
            do (write-sequence in out :start start :end pos)
            when pos do (write-sequence (escape-char (char in pos)) out)
            while pos))))
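To see how the loop in escape advances, it may help to look at the positions it would compute for a short input. The sketch below is only an illustration: it repeats the position-if calls by hand for the string "a<b>c", using the start values the loop would generate (0, then one past each match).

```lisp
;; Illustration only: the positions ESCAPE's loop would find in "a<b>c".
(flet ((needs-escape-p (char) (find char "<>&")))
  (let ((in "a<b>c"))
    (list (position-if #'needs-escape-p in :start 0)
          (position-if #'needs-escape-p in :start 2)
          (position-if #'needs-escape-p in :start 4))))
;; => (1 3 NIL)
```

On the last pass pos is NIL, so the :end argument to write-sequence is NIL, which means "through the end of the string." The final stretch of unescaped text is therefore written before the while clause ends the loop.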

You can also define two parameters: *element-escapes*, which contains the characters you need to escape in normal element data, and *attribute-escapes*, which contains the set of characters to be escaped in attribute values.

(defparameter *element-escapes* "<>&")

(defparameter *attribute-escapes* "<>&\"'")

Here are some examples:

HTML> (escape "foo & bar" *element-escapes*)

"foo &amp; bar"

HTML> (escape "foo & 'bar'" *element-escapes*)

"foo &amp; 'bar'"

HTML> (escape "foo & 'bar'" *attribute-escapes*)

"foo &amp; &apos;bar&apos;"

Finally, you’ll need a variable, *escapes*, that will be bound to the set of characters that need to be escaped. It’s initially set to the value of *element-escapes*, but when generating attributes, it will, as you’ll see, be rebound to the value of *attribute-escapes*.

(defvar *escapes* *element-escapes*)
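The rebinding mentioned above relies on ordinary dynamic binding of a special variable. Here is a minimal sketch of that mechanism, assuming a fresh image in which *escapes* hasn't already been given another value; the function current-escapes is a made-up helper for this illustration, not part of FOO.

```lisp
(defparameter *element-escapes* "<>&")
(defparameter *attribute-escapes* "<>&\"'")
(defvar *escapes* *element-escapes*)

;; Hypothetical helper: reports whatever *ESCAPES* is currently bound to.
(defun current-escapes () *escapes*)

(list (current-escapes)                        ; global value
      (let ((*escapes* *attribute-escapes*))   ; rebound for this extent only
        (current-escapes))
      (current-escapes))                       ; global value again
;; => ("<>&" "<>&\"'" "<>&")
```

Because the binding made by LET is undone when its body exits, code that emits attribute values can switch escape sets without having to restore the old value by hand.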
