From www@deja.com Fri Jan 5 18:36:25 2001
Message-ID: <9360rf$o8i$1@nnrp1.deja.com>
From: oleg@pobox.com
Subject: XML as Scheme, or Scheme as XML [was: XEXPR needs Schemers' help]
Date: Sat, 06 Jan 2001 02:45:03 GMT
Reply-To: oleg@pobox.com
Newsgroups: comp.lang.scheme,comp.text.xml
References:
X-Article-Creation-Date: Sat Jan 06 02:45:03 2001 GMT
X-Comment: Corrected: post-order is moved to SXML-tree-trans.scm;
    more elegant handling of quoted symbols and strings,
    better XML names for standard Scheme symbols like + and *
Status: OR

An alternative to embedding Scheme into XML (see the W3C Technical
Note XEXPR) is embedding XML into Scheme. This embedding -- SXML [1]
-- is well-defined: a SSAX parser [2] takes a well-formed XML document
and returns a corresponding S-expression. SXML is designed in such a
way that it may denote executable Scheme code. That is, the output of
the SSAX parser is not only "data" but also an "expression" (following
the R5RS syntax definitions).

An SXML pretty-printer [1] takes an S-expression -- which may denote
Scheme data or Scheme code -- and turns it into HTML or XML. The rest
of this article will demonstrate that this procedure is in some sense
invertible. That is, we can mechanically convert between Scheme code
or data and XML. We can use XML (i.e., Scheme) to write
transformations of an _arbitrary_ XML document (i.e., S-expression).
Some parts of this assertion have been demonstrated previously [1,3].
This article will concentrate on code transformations and on viewing
XML as executable code.

As a first example, let us consider the following Scheme expression,
from SRFI-13:

    (string-join '("foo" "bar" "baz") ":")

Applying a transformation

    (SRV:send-reply
      (post-order
        (with-input-from-string
          "(string-join '(\"foo\" \"bar\" \"baz\") \":\")" read)
        bind3))

gives

    <string-join><list><str>foo</str><str>bar</str><str>baz</str></list>
    <str>:</str></string-join>

The definitions of post-order and bind3 are given in the appendix.
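
One detail of the transformation above is that the reader expands the
quote mark into a (quote ...) form, which is why bind3 in the appendix
carries a handler for the symbol quote. A minimal sketch of that step,
assuming an R7RS Scheme (or one with SRFI-6) that provides
open-input-string; the name expr is only illustrative:

```scheme
; Read the example expression from a string, as the article does with
; with-input-from-string. open-input-string is assumed to be available
; (R7RS or SRFI-6); expr is a hypothetical name for this sketch.
(define expr
  (read (open-input-string
          "(string-join '(\"foo\" \"bar\" \"baz\") \":\")")))

; The operator position holds the symbol string-join.
(display (car expr)) (newline)

; The second element is a list whose head is the symbol quote,
; so a tree walker sees quote as just another "tag".
(display (eq? (car (cadr expr)) 'quote)) (newline)

; The quoted data itself is the list of three strings.
(display (length (cadr (cadr expr)))) (newline)
```

A traversal keyed on the head symbol of each list therefore meets
quote before it meets the strings, and can treat them as data rather
than as an application.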

If we feed the above XML string into a SSAX parser:

    (call-with-input-string doc-string
      (lambda (port) (SSAX:XML->SXML port '())))

(where doc-string is assumed to be bound to the XML string), we arrive
roughly at where we started:

    (string-join (list (str "foo") (str "bar") (str "baz")) (str ":"))

This S-expression is a genuine Scheme expression, and can be evaluated
as such. The result will be the same as that of the original
expression, provided that we define 'str' as the identity function.

A few other examples follow. As before, we use a 'post-order' function
with 'bind3' bindings as defined in the appendix to convert from a
Scheme expression to XML. We use a SSAX parser to turn XML into SXML.

Scheme: (string-join '() ":" 'suffix)
XML:    <string-join><empty/><str>:</str><qsymb>suffix</qsymb></string-join>
SXML:   (string-join (empty) (str ":") (qsymb "suffix"))

We should define 'empty' to return '() and 'qsymb' as string->symbol.

Scheme: (+ 1 (* 2 3))
XML:    <plus><num>1</num>
        <star><num>2</num><num>3</num></star></plus>
SXML:   (plus (num "1") (star (num "2") (num "3")))

If we define 'num' as string->number, and bind 'plus' to an addition
procedure and 'star' to a multiplication procedure, we can even
compute that expression.

Scheme: (if (> 1 2) #t (if #f #f))
XML:    <if><_Ex3E><num>1</num><num>2</num></_Ex3E><true/><if><false/><false/></if></if>
SXML:   (if (_Ex3E (num "1") (num "2")) (true) (if (false) (false)))

Again, the Scheme->XML->SXML transformation is the identity modulo
trivial bindings for the 'true' and 'false' procedures.

The SXML pretty-printer relies on a function post-order, which
traverses an SXML tree -- or any Scheme expression in general -- and
evaluates its nodes in post-order. The 'post-order' function can also
be used to evaluate a Scheme expression directly. For example,

    (post-order
      (with-input-from-string "(+ 1 (* 2 3))" read)
        ; bindings for SXML "tags"
      `((+ . ,(lambda (trigger . values) (apply + values)))
        (* . ,(lambda (trigger . values) (apply * values)))
        (*text* . ,(lambda (trigger value) value))
        (*default* . ,(lambda (trigger . value)
                        (error "undefined " trigger)))))
    ==> 7

This should come as no surprise, as 'post-order' *is* an
implementation of eval -- or eval is a post-order traversal of a
Scheme expression tree. Still, it appears curious that the same
function used to pretty-print Scheme code into XML can be used to
evaluate the Scheme code -- given the appropriate "bindings".

[1] http://pobox.com/~oleg/ftp/Scheme/SXML-short-paper.html
[2] http://pobox.com/~oleg/ftp/Scheme/xml.html#XML-parser
    This parser is written in Scheme. The parser can handle XML
    Namespaces, CDATA sections, processing instructions, etc.
[3] http://pobox.com/~oleg/ftp/Scheme/xml.html#eval-SXML


Appendix.

; Gambit-3.0

(define bind3
  `(
      ; first handle special forms
    (quote
        ; Local override for *default* within a quote
        ; '(a b) is _not_ an application of a to b
      ((*default* . ,(lambda (trigger . value)
                         ; trigger must be a symbol
                       (list "<qsymb>" trigger "</qsymb>" value))))
        ; quote handler
      . ,(lambda (trigger . value)
           (cond
             ((not (pair? value)) (error "bad quotation:" value))
             ((pair? (car value)) (list "<list>" value "</list>"))
             ((null? (car value)) (list "<empty/>"))
             (else value))))
    (*text* . ,(lambda (trigger value)
                 (cond
                   ((number? value) (list "<num>" value "</num>"))
                   ((eq? value #t) (list "<true/>"))
                   ((eq? value #f) (list "<false/>"))
                   (else (list "<str>" value "</str>")))))
    (*default* . ,(lambda (trigger . value)
                    (let ((trigger (symbol->goodXML trigger)))
                      (list #\< trigger #\> value "</" trigger ">\n"))))))

(define (symbol->goodXML symb)
  (define (good-start-char? a-char)
    (or (char-alphabetic? a-char) (char=? #\_ a-char)))
  (define (good-name-char? a-char)
    (or (char-alphabetic? a-char)
        (string-index "0123456789.-_" a-char)))
  (define (every? pred lst)
    (or (null? lst) (and (pred (car lst)) (every? pred (cdr lst)))))
  (let ((strl (string->list (symbol->string symb))))
    (cond
        ; the symbol is already a valid XML name: use it as is
      ((and (good-start-char? (car strl))
            (every? good-name-char? (cdr strl)))
       symb)
        ; readable names for a few standard Scheme symbols
      ((assq symb '((+ . "plus") (- . "minus") (< . "lt") (* . "star")))
       => cdr)
        ; otherwise encode every character as x followed by its hex code
      (else
        (apply string-append
          (cons "_E"
            (map (lambda (c)
                   (string-append "x" (number->string (char->integer c) 16)))
                 strl)))))))

The other procedures are defined in
http://pobox.com/~oleg/ftp/Scheme/SXML-tree-trans.scm