From posting-system@google.com Tue Jul 16 03:26:02 2002
Date: Mon, 15 Jul 2002 20:25:53 -0700
From: oleg@pobox.com (oleg@pobox.com)
Newsgroups: comp.lang.scheme
Subject: Scheme->syntax-rules [Was: hygienic macro primality tester]
Message-ID: <7eb8ac3e.0207151925.37dfbd59@posting.google.com>
Status: OR

A more systematic way of writing syntax-rules macros is to
mechanically convert the corresponding Scheme procedures. Scheme
functions are far easier to develop and debug. After we are sure our
Scheme code works, we submit our procedures to a
Scheme--to--syntax-rules compiler -- and use but don't look at the
result. As an example, this article develops a more general primality
tester syntax-rule macro based on the Eratosthenes sieve. This
approach also helps Joe Marshall, who wrote "I'm in need of an
INTERCAL compiler written in SYNTAX-RULES" Well, he can just compile
the existing INTERCAL compiler to syntax-rules.

* Eratosthenes sieve
* Pure-functional, generic implementation of the Eratosthenes sieve
* Compilation of the Scheme implementation to syntax-rules macros


* Eratosthenes sieve

We should first remark that most of the examples of the Eratosthenes
sieve -- in particular, the examples in Haskell distributions -- are
incorrect. They mis-represent the sieve algorithm. The beauty of the
Eratosthenes sieve is the complete absence of division (or related
'mod') operations. A division-free algorithm was quite an advantage in
ancient times. This fact lets us encode the primality testing with the
minimum of arithmetics.

The Eratosthenes sieve algorithm trades space for speed. We write out
numbers on a piece of papyrus, starting from 2 till the desired upper
limit. We take the first number -- two in our case -- and hole out
every second number. We scan the piece for the next non-holed-out
number (which will be three) and hole out every third number, starting
from three. At the end, only the prime numbers will be left. The
Eratosthenes sieve, according to the above description, has an
imperative flavor. That is how it is often implemented [Note 1].
Nothing however prevents us from re-writing the algorithm in a pure
functional style, as the next section shows.

The Eratosthenes sieve algorithm can be used for testing if a number n
is prime. We build a list of numbers 2 through n and run the
algorithm. If the last number on the list is "holed out", n is
composite. Otherwise, it is prime.


* Pure-functional, generic implementation of the Eratosthenes sieve

First, we need to build a list of numbers 2 through n.

; Given a number 'n', build a list (2..n)
(define (iota n)
  (letrec
      ((loop 
	(lambda (curr counter)
	  (if (less-than-two? counter) '()
	      (cons curr (loop (incr curr) (decr counter)))))))
    (loop (number-two) n)))

Note how generic this function is: it deals with integers as an
abstract data type. We do not care how integers are actually
implemented.

; The sieve algorithm itself
; Given the list (2..n), return a sieved list
; all composite numbers are replaced by 0

(define (sieve lst)
  (letrec
      ((choose-pivot ; choose the first non-zero element in lst
	(lambda (lst)
	  (cond
	   ((null? lst) lst)
	   ((number-zero? (car lst)) ; skip over zeros
	    (cons (car lst) (choose-pivot (cdr lst))))
	   (else 
	    (cons (car lst)          ; sieve with the pivot, and recurse
		  (choose-pivot
		   (do-sieve (car lst) (decr (car lst)) (cdr lst))))))))
       ; Run the sieve
       ; (do-sieve step current lst)
       ; Return the new lst with each step-th element sieved out:
       ; replaced by 0.
       (do-sieve
	(lambda (step current lst)
	  (cond
	   ((null? lst) lst)
	   ((number-zero? current)    ; replace the current element with zero
	    (cons (number-zero) (do-sieve step (decr step) (cdr lst))))
	   (else
	    (cons (car lst) (do-sieve step (decr current) (cdr lst)))))))
       )
    (choose-pivot lst)))

; Check if a given number is a prime: return 'prime or 'composite
; We run the sieve and then check if the last element has been weeded out

(define (is-prime n)
  (if (number-zero? (car (reverse (sieve (iota n)))))
      'composite
      'prime))

To test these functions, we need to supply an implementation of
integers. First we use the most convenient representation:

(define (number-zero) 0)
(define (number-two) 2)
(define (incr n) (+ 1 n))
(define (decr n) (- n 1))
(define (less-than-two? n) (< n 2))
(define number-zero? zero?)

BTW, these are the only integer "methods" that we need to run
the Eratosthenes sieve. We don't even need general addition, subtraction,
or comparisons -- let alone multiplication or division.
Here are sample test cases and their output:

(display (iota 5))
   ;==> (2 3 4 5)
(display (sieve (iota 5)))
   ;==> (2 3 0 5)
(display (sieve (iota 25)))
   ;==> (2 3 0 5 0 7 0 0 0 11 0 13 0 0 0 17 0 19 0 0 0 23 0 0)
(display (is-prime 19)) ;==> prime
(display (is-prime 25)) ;==> composite

Next, we choose a Peano-Church representation of integers:

(define (number-zero) '() )
(define (number-two) '(( () )) )
(define (incr n) (list n))
(define (decr n) (car n))
(define (less-than-two? n) (or (null? n) (null? (car n))))
(define number-zero? null?)

The tests become

(display (iota '((((( () ))))) ))
   ;==> (((())) (((()))) ((((())))) (((((()))))))
(display (sieve (iota '((((( () ))))) )))
   ;==> (((())) (((()))) () (((((()))))))
(display (is-prime '((((((((((((((((((( () ))))))))))))))))))) ))
   ;==> prime
(display (is-prime '((((((((((((((((((((((((( () ))))))))))))))))))))))))) ))
   ;==> composite

Everything seems to work as expected.


* Compilation to syntax-rules.
Now, we can mechanically compile iota, sieve and is-prime
procedures to syntax-rule macros. First, however, we need to implement
the primitives number-zero, number-two, etc. as syntax-rule macros,
and add them to the initial environment of the compiler:

     ,(lambda (env)
	(env-extend env 'incr '?incr
	   applies-directly: #t
	   cps-code:
	   '(define-syntax ?incr
	      (syntax-rules ()
		((_ n k) (??!apply k  (n) ))))))

     ,(lambda (env)
	(env-extend env 'decr '?decr
	   applies-directly: #t
	   cps-code:
	   '(define-syntax ?decr
	      (syntax-rules ()
		((_ (n) k) (??!apply k  n ))))))


     ,(lambda (env)
	(env-extend env 'less-than-two? '?less-than-two?
	   applies-directly: #t
	   if-fork: 'ifless-than-two?
	   cps-code:
	   '(define-syntax ?less-than-two?
	      (syntax-rules ()
		((_ ((n)) k) (??!apply k #f))
		((_ x k)  (??!apply k #t))))))

etc.


Now, we can just run the compiler:

(define test-Eratosthenes
  '(
    (define (iota n)
	; exactly as above
    )
   (define (sieve lst)
	; exactly as above
   )
   (define (is-prime n) ; exactly as above
     (if (number-zero? (car (reverse (sieve (iota n)))))
	 'composite
	 'prime))
   (define (reverse lst)
     (letrec ((loop
	       (lambda (lst accum)
		 (if (null? lst) accum
		     (loop (cdr lst) (cons (car lst) accum))))))
       (loop lst '())))
))

(write-out init-env test-Eratosthenes
	   ; a test case
 	   '(?iota ((((((( () )))))))
 	       (??!lambda (x) (?sieve (??! x)
 		    (??!lambda (x) (display '(??! x))))))
)

See the full code at
	http://pobox.com/~oleg/ftp/Scheme/cps-macro-conv.scm

The compilation produces a file "/tmp/a.scm". We can interpret the
compiled code:
   ~> bigloo -i -hygien /tmp/a.scm
which prints the result:
	(((())) (((()))) () (((((()))))) () (((((((()))))))))
If we macroexpaned the code, e.g.,
   ~> bigloo -hygien -syntax /tmp/a.scm
we would see that this result is indeed computed at macro-expand time.

I do not dare to show the compilation result. Ok, I will, but only a
little:

   (define-syntax
     ?is-prime
     (syntax-rules
       ()
       ((_ _?n _?kg1074)
        (?iota _?n
               (??!lambda
                 (g1081)
                 (?sieve
                   (??! g1081)
                   (??!lambda
                     (g1080)
                     (?reverse
                       (??! g1080)
                       (??!lambda
                         (g1079)
                         (?car (??! g1079)
                               (??!lambda
                                 (g1078)
                                 (?number-zero?
                                   (??! g1078)
                                   (??!lambda
                                     (g1075)
                                     (?iftrue?
                                       (??! g1075)
                                       (??!lambda
                                         (g1076)
                                         (??!apply _?kg1074 composite))
                                       (??!lambda
                                         (g1077)
                                         (??!apply
                                           _?kg1074
                                           prime))))))))))))))))

       (define-syntax
         ?iota
         (syntax-rules
           ()
           ((_ _?n _?kg1029)
            (letrec-syntax
              ((?loop (syntax-rules
                        ()
                        ((_ _?currg1031 _?counterg1032 _?kg1030)
                         (?ifless-than-two?
                           _?counterg1032
                           (??!lambda (g1033) (??!apply _?kg1030 ()))
                           (??!lambda
                             (g1034)
                             (?incr _?currg1031
                                    (??!lambda
                                      (g1036)
                                      (?decr _?counterg1032
                                             (??!lambda
                                               (g1037)
                                               (?loop (??! g1036)
                                                      (??! g1037)
                                                      (??!lambda
                                                        (g1035)
                                                        (?cons _?currg1031
                                                               (??! g1035)
                                                               _?kg1030)))))))))))))
              (?number-two
                (??!lambda
                  (g1038)
                  (?loop (??! g1038) _?n _?kg1029)))))))


The assembler (er, CPS) code is never pretty.

To test the primality tester, we compile the following code
(write-out init-env test-Eratosthenes
	   '(?is-prime (((((( () ))))))
		    (??!lambda (x) (display '(??! x))))
)
which macro-expands to (display 'composite)
and
(write-out init-env test-Eratosthenes
	   '(?is-prime ((((( () )))))
		    (??!lambda (x) (display '(??! x))))
)
which macro-expands to (display 'prime). Indeed, 6 is a composite
number and 5 is a prime.


The compilation to syntax-rules is hardly optimized. The
result is very slow. In general, the implementation of
syntax-rules on Bigloo (and many other systems) is slow and takes lots
of memory -- that's why it is off by default. We need to specify a
special flag -hygien to activate syntax-rules macros on Bigloo. I
guess no one bothered to optimize the macro expander, because no one
asked for that.

We must emphasize however the speed of developing the macros. We do
all the development and testing with regular Scheme procedures. Once
we have done all that, the translated syntax-rules macros _just work_. There
is no need for macrology any more.


Note 1. An optimized imperative-style Eratosthenes sieve is
implemented in
	    http://pobox.com/~oleg/ftp/Scheme/Eratosthenes-sieve.txt
That implementation took advantage of the fact that even numbers
greater than two are composite -- therefore, there is no point of
storing them. This improves the space and time complexities of the
implementation two-fold. The article computed all prime numbers under
one million, and verified a previously proven equality
    Prod{ (1 - 1/p[i]^2), i=1..K} -> 1/Z[2] = 6/Pi^2
where p[i] is the i-th prime number (p[1]=2)