Programming and Computation



Programming as collaborative reference

We argue that programming-language theory should face the pragmatic fact that humans develop software by interacting with computers in real time. This interaction not only relies on but also bears on the core design and tools of programming languages, so it should not be left to Eclipse plug-ins and StackOverflow. We further illustrate how this interaction could be improved by drawing from existing research on natural-language dialogue, specifically on collaborative reference.

Overloading resolution is one immediate application. Here, collaborative reference can resolve the tension between fancy polymorphism and the need to write long type annotations everywhere. The user will submit a source code fragment with few or no annotations. If the compiler gets stuck -- e.g., because of ambiguous overloading -- it will ask the user for a hint. At the successful conclusion of the dialogue, the compiler saves all relevant context, in the form of inserted type annotations or as a proof tree attached to the source code. When revisiting the code later on, the same or a different programmer may ask the compiler questions about the inferred types and the reasons for chosen overloadings. We argue that such an interactive overloading resolution should be designed already in the type system (e.g., using oracles).

Joint work with Chung-chieh Shan.

The current version is January 2012
programming-collaborative-ref.pdf [131K]
The extended abstract, submitted to the ``Off-the-beaten track'' workshop affiliated with POPL 2012.


Markov Algorithms

If asked to name models of computations, Turing machines, lambda calculus and, perhaps, register machines would be the common answers. It is much less likely to hear Post systems or Markov algorithms. Which is regrettable. Inference rules, so familiar in logic and theoretical computer science, are a manifestation of Post systems. So are template engines, EBNF rules, XSLT -- or macros. It may be insightful to look closely at these systems and their rather surprising expressivity results. After all, it was insightful to Chomsky, whose context-free and other generative grammars were influenced by Post systems. Computational models are not just of theoretical interest. Markov algorithms inspired a real programming language, Refal, which was based on pattern matching long before it has become common. Peephole optimizers are also Markov algorithms.

Post systems (think of generative grammars) are string transformations based on repeated, perhaps context-sensitive, substitutions -- replacements of substrings by other strings. The substitutions are expressed via a finite sequence of rules (grammar productions). The order of applying the rules is not defined. Normal Markov algorithms, true to the name, are a restricted, `normal' form of rule-based substitution systems. Substitutions are performed in the strictly defined order and regardless of the surrounding context. The whole string transformation process runs deterministically and can be done by a simple mechanism, a `machine'. Normal Markov algorithms deserve to be called `algorithms'. (There is a play of words missing in English: `algorithm' is `алгоритм' in Russian. Markov called his system `алгорифм'.)

Normal Markov algorithm is hence a machine that repeatedly rewrites the given input string according to an ordered sequence of rewriting rules. Each rule is a pair of strings: the source src and the replacement rplc, each of which may be empty. Some rules may be marked as terminal. The work cycle of the machine is: find the first rule in the sequence whose src appears as a substring of the input string. Replace the leftmost occurrence of the src with the rplc of the rule. If the rule is terminal, stop. Otherwise, start the cycle anew, the just rewritten string being the input. If no rule applies, stop.

As an example, the following sequence of rules, written as an OCaml array, converts a big-endian binary number (a string of 0s and 1s) into the string of vertical bars encoding the same number in unary.

    let bin_to_unary = [|
     rule "1"   "0|";
     rule "|0"  "0||";
     rule "0"   "";
For example, run bin_to_unary "110", using the OCaml code below, produces the string "||||||" of six bars. The code also prints the trace of fired rules and their intermediate results, showing the working of this clever algorithm.

Curious readers may want to work out how the following rules

    let gcd = [|
      rule "aA"  "Aa";
      rule "a#a" "A#";
      rule "a#"  "#B";
      rule "B"   "a";
      rule "A"   "C";
      rule "C"   "a";
      rule "#"   "";
manage to rewrite the string made of i letters a followed by # and more j letters a into the string of k letters a where k=gcd(i,j).

Markov has shown that all known classes of string transformations performed by a finite sequence of rewriting rules can be expressed as a normal algorithm (that is, `normalizable'). This let him pose that all algorithms are normalizable. He justified his thesis more than either Turing or Church did for the corresponding Church-Turing thesis. Here is how V.M.Glushkov (cited from the English translation) describes this justification:

The validity of this [Markov normalization] principle is based first of all on the fact that all the algorithms known at the present time are normalizable. Since in the course of the long history of the development of the exact sciences a considerable number of different algorithms have been devised, this statement is convincing in itself.

In actuality it is even more convincing. We can show that all the methods known at the present time for the composition of algorithms which make it possible to construct new algorithms from the already known ones do not go beyond the limits of the class of normalizable algorithms. ...

However this is not all. A whole series of scientists have undertaken special attempts to construct algorithms of a more general form and all these attempts have not been carried beyond the limits of the class of normalizable algorithms.

I'd like to clear a common misunderstanding about the exact period when A.A.Markov developed his theory of algorithms. All the English sources that I have seen claim that Markov Algorithms was the work of 1960s -- most likely, based on the publication date of the very brief English translation of Markov's work. However, the theory was published in Russian quite a bit earlier. According to the bibliography of A.A.Markov (referenced below), the first paper `Theory of algorithms' appeared in Steklov's Mathematics Institute journal in 1951. Three years later, in 1954, A.A.Markov published a 376-page book with the same title. He was working in the area of string algorithms already in 1947, when he published his famous result: undecidability of word problems in semigroups.

The current version is October 2019
References [6K]
The complete OCaml code using only the standard library
The code implements the `Markov machine' and contains several examples, including binary-to-unary conversion, addition and subtraction of unary numbers, and the emulation of the Turing machine for unary addition.

Complete list of publications by A.A.Markov

Андрей Андреевич Марков (1903-1979)
This comprehensive web site also has rare family photos (showing A.A.Markov's father, the inventor of Markov chains) and poems written by A.A.Markov

Andrey Andreevich Markov 1960. The Theory of Algorithms. American Mathematical Society Translations, series 2, 15, 1-14.

В.М.Глушков. Введение в Кибернетику
Издательство Академии Наук Украинской ССР. Киев, 1964. 324 стр.
There is an English translation:
V.M.Glushkov. Introduction to Cybernetics
U.S. Air Force Systems Command. Wright-Patterson Air Force Base. Ohio. Translation division. Foreign technology division. FTD-TT-65-942/1+2. AFLC-WPAFB-Mar 66 72. Feb 17, 1966.

V.F.Turchin. The Concept of a Supercompiler
ACM Transactions on Programming Languages and Systems, 1986, v8, 292--325. doi:10.1145/5956.5957
Sec 4 of the paper is probably the best presentation of Refal in English, along with many examples and comparison with Lisp and Prolog. Turchin emphasized that Refal is a pure functional language.


Functional and Logic Programming Symposium FLOPS 2016: Special issue of selected papers

This special issue comprises the extended, journal versions of the seven papers first presented at the Thirteenth International Symposium on Functional and Logic Programming. FLOPS 2016 was held in March 4-6, 2016 in Kochi, Japan. The symposium program included two invited talks, 14 paper presentations, three 1.5hr tutorials and a poster session. At the conclusion of the conference, the program committee selected 8 papers and invited their authors to submit an extended and improved version. Each submission was reviewed by three reviewers according to rigorous journal standards. Seven have made it to this special issue.

FLOPS brings together practitioners, researchers and implementers of declarative programming, to discuss mutually interesting results and common problems: theoretical advances, their implementations in language systems and tools, and applications of these systems in practice. The scope includes all aspects of the design, semantics, theory, applications, implementations, and teaching of declarative programming. FLOPS specifically aims to promote cross-fertilization between theory and practice and among different styles of declarative programming. This wide scope and the interplay of theory and practice is reflected in the papers of this special issue.

Logic/Constraint Programming and Concurrency: The Hard-Won Lessons of the Fifth Generation Computer Project, by Kazunori Ueda
FLOPS 2016 opened with the invited talk by Kazunori Ueda, who has reflected on the influential Fifth Generation Computer Systems project (1982-1993) and the declarative language developed for its massively concurrent hardware. Ueda was the designer of Guarded Horn Clauses, which shaped Kernel Language 1 (KL1). The latter was both the systems and the general-purpose programming language of the Parallel Inference Machine. His paper is a personal history of Guarded Horn Clauses, told in the context of programming models of 1980s and the present-day constraint-based concurrency that grew out of that project. The paper ends with an illustration of KL1, through its translator to C developed at the final year of the project. It runs on modern machines and shows excellent scalability.
Incremental Computing with Data Structures by Akimasa Morihata
The paper, albeit theoretical, deals with a rather practical problem: minimizing recomputations when an input data structure is updated. It extends Jeuring's incremental computing method on algebraic data structures to complex data structures such as splay trees or zippers, and to any bottom-up structurally recursive computation.
Embedding the Refinement Calculus in Coq by Joao Alpuim and Wouter Swierstra
The authors take on the holy grail of software engineering: deriving programs from their specifications -- to be more precise, refining specifications to programs. Although the refinement calculus has been proposed a long time ago, using it `by hand' is error-prone and extremely tedious. Alpuim and Swierstra present the embedding of the calculus in Coq, which helps with both problems. Refinements are provably correct and result in formally verified programs. In addition, Coq offers a convenient interactive environment, with a reasonable degree of automation.
A Coq Library For Internal Verification of Running-Times by Jay McCarthy, Burke Fetscher, Max S. New, Daniel Feltey and Robert Bruce Findler
The paper implements a monad that counts the number of operations during a (would be) program run. The monadic Coq program can afterwards be converted to idiomatic OCaml code. The operation counts however can be used to reason about program performance: the paper proves that red-black tree insertion and search, several sorting algorithms, Okasaki's Braun Tree algorithms among others, all have their expected running times.
Space-efficient Acyclicity Constraints: A Declarative Pearl by Taus Brock-Nannestad
This is truly a pearl: of the encoding of acyclicity of a graph in terms of a local constraint -- surprisingly, with only one bit of information per node. Originally presented for planar graphs, the approach is extended to graphs embeddable on surfaces of higher genus (e.g., torus), and to the encoding the connectedness and other related constraints.
A modular foreign function interface by Jeremy Yallop, David Sheets and Anil Madhavapeddy
The FLOPS' emphasis on the connections between theory and practice should have already been evident. The remaining two papers stress the connection with particular force. Both present widely used libraries. In the papers, the library authors motivate and reflect on their design. First, Jeremy Yallop, David Sheets and Anil Madhavapeddy describe the modern foreign function (FFI) interface for OCaml, stressing the benefit of declarative specification and ML-style modules. The same FFI specifications can be used to generate static, dynamic, synchronous and asynchronous binding code.
Boolean Constraints in SWI-Prolog: A Comprehensive System Description by Markus Triska
FLOPS has a long history of encouraging system descriptions as a special category of submissions -- considering extensive, good, useful, publicly available implementation work to be a contribution in itself. Markus Triska's paper is a fitting representative of this category. The paper presents the Boolean symbolic constraint solver implemented in pure Prolog, explaining the core algorithms and implementation trade-offs. Rather than relying on SAT, the library performs symbolic BDD-like manipulation of systems of Boolean constraints, arguably more in the spirit of the constraint logic programming model.

We are grateful to the authors for preparing the extended versions of their papers and promptly addressing reviewers' comments. We deeply thank the reviewers for all the work they have done to evaluate and improve the submissions.

Joint editing with Andy King, University of Kent, UK

Science of Computer Programming, volume 164, 15 October 2018
Special Issue of selected papers from FLOPS 2016

FLOPS-poster.pdf [125K]
The conference poster


Konrad Zuse as the inventor of predicated execution

Konrad Zuse's Z3 was far ahead of its time. Unlike ENIAC, it was controlled by a software program. In Z3, the numbers were represented in a floating point format greatly resembling the one in the IEEE Floating-Point standard. Z3 had two ALUs so it operated on the exponent and the mantissa of a number in parallel.

It appears Z3 was farther ahead than we might have realized. Z3 had no branch instruction and instead relied on predicated execution. The latter technique -- implemented in the most recent upscale CPUs -- is considered to be ``one of the most significant [recent] advances in the history of computer architecture and compiler design.'' It seems that advance is not recent at all: it was invented by Konrad Zuse back in 1938. The letter to Dr.Dobbs J. Editors gives more details.

Raul Rojas: Simulating Konrad Zuse's Computers
Dr. Dobbs J., September 2000, pp. 64-69

Zuse-accolades.txt [6K]
The letter to the editor regarding the above article. It was published in Dr. Dobbs J., December 2000, p. 10

Zuse and Intel: A letter from reader Ben Laurie
Dr. Dobbs J., August 2001, p. 12
Ben Lauri points out that Acorn's ARM chip has had the predicated execution feature more than a decade ago (but far after Z3)

Konrad Zuse Internet Archive


Programming and Reasoning with Algebraic Effects and Effect Handlers

The Shonan Seminar 146 on algebraic effects and handlers took place at the Shonan Village Center, Kanagawa, Japan on March 25--29, 2019.

Algebraic effects and effect handlers are becoming an increasingly popular approach for expressing and composing computational effects. There are implementations of algebraic effects and effect handlers in Clojure, F#, Haskell, Idris, Javascript, OCaml PureScript, Racket, Scala, Scheme, Standard ML, and even C. There are full-fledged languages built around effects such as Eff, Frank, Links, Koka, and Multicore OCaml. Moreover, there is growing interest from industry in effect handlers. For instance, Facebook's React library for JavaScript UI programming is directly inspired by effect handlers, and Uber's recently released Pyro tool for probabilistic programming and deep learning relies on effect handlers. Such interest arises from the ease of combining in the same program independently developed components using algebraic effects and effect handlers.

The increased adoption and use of algebraic effects and effect handlers reveal and make pressing three main problems: reasoning, performance, and typing. Given the complexity of these problems and their importance, we believed the face-to-face meeting of main community representatives would promote their solution.

We identified five specific application areas to be discussed at the meeting in the context of the three main problem areas:

To promote mutual understanding, we have planned for the workshop to have substantial time available for discussion. We emphasized tutorials, brainstorming, and working-group sessions, rather than mere conference-like presentations.

The seminar was organized together with Sam Lindley, Gordon Plotkin and Nicolas Wu. The Shonan seminar series is sponsored by Japan's National Institute of Informatics (NII).

The web pages of the seminar

shonan-handlers-report.pdf [221K]
The final report


Self-referential values and back pointers in strict and lazy pure functional languages

Henry Baker remarked that ``it is impossible to create infinite trees -- i.e., directed cycles -- without side-effects.'' This article shows limitations of that statement. One can construct cyclic lists and other self-referential values in a pure functional style, without resorting to mutations. Lazy evaluation -- either supported by a language, or emulated -- makes this possible. Lazy evaluation even allows an invocation of a function with an argument that refers to the function's result.

A follow-up article showed various ways to simulate self-referential values and lazy computation in ML (OCaml).

Furthermore, we can create DAGs and general cyclic data structures even in a strict pure functional language. The following article ``On parent pointers in SXML trees'' demonstrates different ways of constructing tree-like data structures with forward and backward pointers. Some of the data structures are general graphs with directed cycles, some of them are DAGs. And yet some of them are genuine trees. How can a recursive data structure have forward and backward pointers and yet only a single path between any two nodes? How can we do that in a strict pure language? The article and the accompanying source code explain. The paradoxical tree is built by a procedure add-thunk-parent-pt, which is defined in section 'Solution 3' of the source code. The procedure is deceptively simple. It took 21 lines of Scheme code to implement the algorithm and 26 (full-length) lines of comments to explain it.

The current version is 1.1, February 14, 2000
functional-selfref.txt [3K]
The USENET article Self-referential values and expressiveness of strict vs. non-strict languages
It was posted on the newsgroups comp.lang.scheme and comp.lang.functional on Mon Feb 14 20:14:49 2000 GMT

Unitary type as a self-referential set

Pure-functional transformations of cyclic graphs and the Credit Card Transform

Henry G. Baker: Equal Rights for Functional Objects or, The More Things Change, The More They Are the Same

On parent pointers in SXML trees
The article also explains practical benefits of creating and manipulating cyclical structures in the side-effect-free manner.


Makefile as a functional language program

Although it looks like an idle musing, this (mis)application of Makefiles was motivated by a practical problem and was actually used in practice. The problem was the explosion of makefile rules in a project that executes many test cases on many platforms. Each test target is built from its own set of source code files. Each (Scheme) platform has its own particular way of assembling source code files into a project; some source code files might need be omitted from the build for a particular platform. Because GNU make turns out to be a functional programming system, we managed to reduce the number of rules from number-of-targets times number-of-platforms to just number-of-targets plus number-of-platforms.

The language of GNU make is indeed functional, complete with combinators (map and filter), applications and anonymous abstractions. Yes, GNU make supports lambda-abstractions. The following is one example from the Makefile in question: it is a rule to build a test target for the SCM Scheme system. The list of source code files and the name of the target/root-test-file are passed as two arguments of the rule:

    make-scmi= scm -b -l $(LIBDIR)/myenv-scm.scm \
               $(foreach file,$(1),-l $(LIBDIR)/$(file)) \
               -l $(2).scm
The rule returns the OS command to interpret or compile the target. It is to be invoked as
    $(call make-scmi,util.scm catch-error.scm,vmyenv)
As in TeX, the arguments of a function are numbered (it is possible to assign them meaningful symbolic names, too). Makefile's foreach corresponds to Scheme's map. The comparison with the corresponding Scheme code is striking:
    (define make-scmi
       (lambda (arg1 arg2)
            `(scm -b -l ,(mks LIBDIR '/ 'myenv-scm.scm)
               ,@(map (lambda (file) `(-l ,(mks LIBDIR '/ file))) arg1)
              -l ,(mks arg2 '.scm))))
The current version is 4.6, October 30, 2003
Make-functional.txt [3K]
The message describing the motivation for the functional Makefile to automate regression testing, and a few examples. It was posted on the SSAX-SXML mailing list on Mon, 18 Nov 2002 12:48:41 -0800

Makefile [7K]
The Makefile in question

Chapter 'Functions' from make info pages


Axiom of Choice as a black-box cursor-like Set API

The following short essay argues that the Axiom of Choice can be regarded as a deconstructor (i.e., cursor-like) API for a set. In contrast, axioms of (restricted) comprehension and of replacement are more like the enumerator API for a set. The Axiom of Choice does not reveal how the set in question has been constructed. The Axiom of Choice may also be viewed as an assertion that set's membership predicate is safe, in Datalog sense.

In a more recent discussion, Neelakantan Krishnaswami introduced the function choose function as an indispensable part of an API for Set, letting us reason about sets algebraically and equationally.

The current version is December 23, 2004
axiom-of-choice.txt [5K]
The essay

Neelakantan Krishnaswami: Re: FP-style map over set
The message posted on the newsgroup comp.lang.functional on Mon, 28 Nov 2005 23:45:27 +0000 (UTC)


Sendmail as a Turing machine

This article proves by construction that sendmail rewriting rules are Turing-complete. They are expressive enough to implement a Turing machine. The article first gives an introduction to the rewriting rules, for easy reference. We then explain how to map Turing machine configuration (the content of the tape and the current position) to a character string, and machine's finite control to rewriting rules. As an example, we demonstrate two Turing machines: an adder of two natural numbers, and an even/odd decision function. We explain how to run these examples and see the result.
The current version is September 10, 2000
sendmail-as-turing-machine.txt [11K]
The article Sendmail as a Turing machine, posted on the newsgroups comp.lang.functional and comp.mail.sendmail on Sun, 10 Sep 2000 22:26:33 GMT [37K]
The sendmail configuration file ( augmented with rules to implement the Turing machine. This file also defines two sample Turing machines.


Turing Machine in Standard ML

The following archive contains a simple implementation of Turing machine in Standard ML (SML/NJ). For illustration, the code includes three sample Turing programs: the successor function, a character substitution function, and a decision function. For example,
    - turing_machine((0,["#","a","a","a","a","a","a"],"#",[]),prog_decision);
    val it = (~1,["#","Yes"],"#",["#","#","#","#","#"])
             : int * string list * string * string list
shows that starting the decision program (which decides if a string contains an even number of characters) with the tape #aaaaaa###.... and the head positioned at the beginning of the tape, leaves # "YES" ####... on the tape after the machine halts (by reaching the state -1). Thus, the machine has decided the input string contained an even number of characters.
The current version is August 5, 1992
turing-sml.tar.gz [3K]
The source code and the transcript of its evaluation