* Machine reasoning about machines: ACL2 theorem prover J. Strother Moore, (University of Texas, Austin) PADL2002 Invited Talk, January 20, 2002. J. Strother Moore is well-known for the Boyer-Moore string matching algorithm and the Boyer-Moore theorem prover. The talk was quite different from the paper (published in LNCS 2257). The paper, entitled "Single-Threaded Objects in ACL2", deals with ACL2's "version" of Haskell monads. The PADL talk was about history, abilities and lessons of the ACL2 prover. ACL2 stands for "A Computational Logic for Applicative Common Lisp". It is the second incarnation of "The Boyer-Moore Theorem Prover (NQTHM)". The original NQTHM project started back in the 1971, when Moore was a PhD student at the U. of Edinburgh after graduating from MIT, and when he met Boyer. They were influenced by Woody Bledsoe and an insight that proving theorems by resolution was actually _programming_ in predicate calculus. The collaboration between Moore and Boyer continued for twenty years. Both moved around the country, and ended up at the U. of Texas (Austin). And then all of the sudden Boyer got interested in philosophy. He is now a professor in the Dept of philosophy at U.Texas (Austin), and teaches Plato and Socrates as well as the introductory Set theory for CS majors. J. Moore mentioned that his department still receives letters addressed to "Professor Boyer-Moore". NQTHM uses Lisp code as logic, to express theorems. The prover is written also in (a home-grown dialect of) Lisp. The logic is first order and has no quantifiers. The prover however is adept in inductive proofs. The first result of the Boyer-Moore prover was the proof of the existence of a list with three elements. Then they proved that the list append is associative, and the prime factorization is unique. The prover got smarter and acquired the knowledge of natural, integral and rational numbers (along with their equality and inequality properties), binary decision diagrams (BDDs), and congruence-based re-writing. http://www.cs.utexas.edu/users/moore/best-ideas/nqthm/innovations.html Each proven lemma is automatically added to the database of known true facts -- and can immediately be used in proving other theorems. We can also prove re-writing rules and reasoning strategies -- and add them to the database for later use. The prover becomes smarter with every successful proof. The brief ACL2 tutorial http://www.cs.utexas.edu/users/kaufmann/tutorial/rev3.html gives the flavor of the logic and the prover. In the 80s, Boyer, Moore and their students proved undecidability of the halting problem, and the Goedel First incompleteness theorem. The latter proof required 2000 lemmas. They proceeded to prove the design of a sample microprocessor, the correctness of the assembler for the micro-processor, and even the correctness of a simple OS. See http://www.cs.utexas.edu/users/moore/best-ideas/nqthm/index.html for the complete list of proved results. One of the notable results of ACL2 was a proof that the floating-point unit (FPU) of the AMD Athlon processor executes elementary FP operations (additions, subtraction, multiplication, division and the square root) strictly according to the IEEE specification. To be precise, they proved that the unit that operates 'extended precision' FP numbers is correct. The regular FP numbers are converted to the extended format by a separate hardware unit, which also deals with unnormalized numbers, NaNs and infinities. This was at a time of a Pentium FDIV scandal. AMD management was terrified to think that the upcoming Athlon may have some rare but immensely embarrassing bug. The FP unit was specified in a Register Transfer Language (RTL) -- a common hardware description language that describes registers of various logical/arithmetical components and data transfers. To prove the correctness of FPU, Moore and his group had to develop a translator from RTL to ACL2, the input language of the prover. Moore had to convince the AMD management that the translation is sound. Alas, RTL is not specified precisely; its semantics is defined by an RTL emulator. The fact that ACL2 specification is itself ACL2 code, which is applicative Common Lisp code, turned out very handy. ACL2 specifications can be executed! The ACL2 _specification_ for the FPU can therefore be considered an emulator for the FPU: it takes bit patterns representing input FP numbers and prints the output bit patterns. To convince the management that the ACL2 FPU specification is a sound representation of the original RTL specification, one needs to show that the two FPU emulators are equivalent. The management gave Moore 80 million test vectors, designed to verify the RTL specs. The compiled ACL2 specs code produced the same results for these vectors as the RTL emulator. That convinced the AMD management that the ACL2 specs accurately reflect the RTL code for their FPU. After that, Moore and his students proceeded to prove that the ACL2 FPU specs are consistent with the IEEE arithmetic rules. They found four real bugs -- which were confirmed as bugs by the RTL emulator. These four bugs survived all perviously tried 80 million test vectors! Moore and his students later proved correctness theorems for an IBM 4758 crypto-processor. The theorems contributed to the crypto-processor's being awarded IFIPS 140-1 rating, the highest security rating for any piece of hardware and software. http://www.cs.utexas.edu/users/moore/best-ideas/acl2/index.html See above for more (industrially important) results proved by ACL2. In 1989, Boyer and Moore opened an empty Emacs buffer and started a complete re-write of NQTHM. The result is the ACL2 prover. Theorems in NQTHM were expressed as Lisp expressions; the prover itself was written in a home-grown dialect of Lisp, in a largely imperative style. First Boyer and Moore wanted to switch to Common Lisp (CL), so they don't have to maintain their own compiler. Mainly they wanted to re-write the prover in a pure-applicative subset of CL. The major attraction is that the specification and the implementation languages become the same; hence we can use the prover to prove (a part of) itself. It turns out that pure-functional style resulted in a better and a better maintainable code. As J. Moore remarked, his experience of writing and maintaining the imperative prover NQTHM for 18+ years, and the similar 11-year experience for the pure-functional ACL2 prover showed the applicative style is better. J. Moore said that he will never go back to the imperative style. The ACL2 prover is 72KLOC long. The documentations takes 1.7 MB; in addition; comments in the code occupy 1.2 MB. Whenever the authors extent or correct a function, they describe the motivation and the action in the comments. Thus ACL2 comments reflect thirty years of the evolution of the theorem provers. The ACL2 home page http://www.cs.utexas.edu/users/moore/acl2 lists two books, several ACL2 workshops. The site contains the full, hyperlinked ACL2 documentation. ACL2 is freely downloadable from the above site. As the ACL2 web page says, "I think it's really cool that the system is coded in its own logic. How many theorem provers can you say that about? How many 5 Mbyte applicative programs can you name?" The ACL2 is used quite frequently for real projects (proofs of correctness for the AMD Athlon FPU, for a Motorola DSP processor, for a JVM chip and for a strcpy()/strcat()/... UNIX string library), partly because it is fast. ACL2 has plenty of tricks to speed up proofs, for example, "lazy" beta-reduction. The latter tries to bundle several pending beta-reductions and execute them during a single traversal of a potentially huge term. For example, a correctness proof of a Motorola CAP DSP processor generated an intermediate formula that was 25MB long. ACL2 can deal with even bigger formulas. The ACL2 specification language and the code itself are untyped. However, ACL2 can infer correctness constraints: e.g., if the statement of a theorem or a function includes (CAR L), then L must be a CONS. The ACL2 tutorial gives a great example. When a user types:
ACL2 !>(defun rev (x)
         (if (endp x)
             nil
           (append (rev (cdr x))
                   (list (car x)))))
The system responds with
The admission of REV is trivial, using the relation E0-ORD-< (which is known to be well-founded on the domain recognized by E0-ORDINALP) and the measure (ACL2-COUNT X). We observe that the type of REV is described by the theorem (OR (CONSP (REV X)) (EQUAL (REV X) NIL)). We used primitive type reasoning and the :type-prescription rule BINARY- APPEND.
Note how the system figured out the type of rev. The type is formulated as a theorem, which can be proved by ACL2 and then used as a side condition whenever function rev is applied. Curry-Howard isomorphism in action! For that reason, a static type system is not necessary. Still, J. Moore keeps an open mind. When asked what he would do if he started with an empty Emacs buffer now, he replied that he would give Haskell a very hard look. One of his (former) students is playing with Haskell to this end; so far, Moore said, he was able to closely and elegantly emulate in ACL2 all the Haskell code that person sent him. J. Moore is a bit concerned how Haskell will play out in terms of performance. He didn't mention any studies though. The speaker couldn't avoid a question about Boyer-Moore string matching algorithm. He said that its discovery was one of the "Aha" moments -- one day in California.