previous   next   up   top

HANSEI as a Declarative Logic Programming Language




Hansei is a probabilistic programming language embedded in OCaml, an OCaml library. It has not been intended for logic programming. It provides no logic variables or unification. Furthermore, tracking and accumulating probabilities is overkill for logic programming.

An inspiring conversation with Mikael More brought up unthought-of questions. What if we do try to use Hansei to solve typical logic programming puzzles? Can we even do it? Can we do it efficiently? How much of logic programming will remain? In particular, can we read the Hansei code as a clear, declarative statement of the problem, without being distracted by the details of the solution? This present web page documents the attempts to answer all these questions. Hansei turns out suitable for logic programming after all.

We discuss logic programming without logic variables in a separate article below. Here we argue why one might want to consider Hansei when solving a typical logic programming problem.

Hansei adheres to declarative programming, by separating building probabilistic models from inference. A model describes the properties of a possible world and defines computations and constraints on them. Inference finds possible worlds satisfying the properties and the constraints. Building models and writing inference (search) procedures are distinct activities with distinct abstractions; writing custom inference procedures is rare since the ones in Hansei often suffice.

Many uses of cuts, negation and committed choice in Prolog are a way of telling that certain predicate is (semi-)deterministic; in other words, the underlying relation is functional. In Hansei, it is the determinism that is the default; deterministic computations run `at full speed' of OCaml, with no overhead. Hansei is thus particularly suitable for logic programs with a large portion of deterministic, especially numeric, computations. Hansei, as underlying OCaml, is typed. Types are notably useful in logic programming, to avoid a typical Prolog problem, of a mis-spelled constant or variable leading to an incorrect answer rather than an error message. Without types, such errors are frustratingly difficult to find.

Probability estimations turn out useful too: if we prematurely terminate a Hansei search, we obtain an estimate of the probability of missing a solution. Assigning different weights to different choices (that is, using non-uniform probability distributions) is a way to supply heuristics to bias the search.

Except for probability, the arguments for Hansei are the arguments for functional logic programming, equally applicable to the languages like Curry. The strong case for functional-logic programming has been made elsewhere, see in particular surveys by Michael Hanus. The advantage of Hansei is that it is not a separate language: it is just a user-extensible OCaml library. The arguments for Hansei also apply to embeddings of non-determinism in a typed functional language (for example, MonadPlus in Haskell), provided the crucial feature of lazy sharing is supported.

The current version is November 2010.
Michael Hanus: Multi-paradigm Declarative Languages
A survey. ICLP 2007, Springer LNCS 4670.
< >

HANSEI: Embedded Probabilistic Programming

Purely Functional Lazy Non-deterministic Programming
The paper describes the importance, the equational theory and the implementation of lazy sharing. The implementation of letlazy in Hansei is based on the technique explained in the paper.


Guess Lazily! Making a program guess, and guess well

[The Abstract of the talk]
Guessing is a part of life and science. We form a hypothesis, work out the consequences and compare with observations. Lots of problems are formulated by first assuming that the solution exists and then describing the properties it should have. Planning, scheduling, diagnostic, learning problems and Sudoku all follow this pattern. Guessing is good not only for describing these problems but also for solving them. We make a guess -- often a series of guesses -- to build, for example, a schedule, and check if it satisfies resource, timeliness and other constraints. Often, we guess again.

How do we write ``guess the value of this variable'' in code? How do we code ``guess again''? How to put in prior knowledge favoring some guesses? The talk first will answer these questions.

Naive guessing however is hopeless even for toy problems. We often have to make lots of guesses before getting a candidate solution to check against the constraints. Only a tiny or even infinitesimal proportion of these guesses yield a successful candidate. How to make good guesses? That is very hard to know: Most physical, biological, sociological, etc. models are set up to compute the consequences of causes rather than the other way around. It helps to reformulate the question: how to avoid too many bad guesses? The talk will describe and illustrate a general principle, found in any serious logic, non-deterministic or probabilistic programming system.

The techniques explained in the talk are not tied to any language and can be used even in C. However, functional, especially typed functional languages have a serious advantage, as we shall see. No prior knowledge of logic or non-deterministic programming is required. The ability to read introductory OCaml or Haskell code will be helpful. The participants will learn how to guess in their favorite language, and what it takes to succeed. They will see laziness, unification and constraint propagation in the same light.

There are several programming languages build around non-determinism: standalone -- such as Icon, Prolog, Curry -- or deeply embedded like Kanren. The talk is not about them. It is about your language. Although the talk uses OCaml for concreteness, the main points are not OCaml-specific and have been realized in other languages. We will not be talking about implementing a dialect of Prolog in your language. Rather, we argue for using non-determinism in your language directly. There is no new language to learn or a strict convention to follow. You write code as you normally do; you call any function in any library as usual -- without any wrappers and without pondering how to turn that third-party function into a relation or to make it reversible. Likewise, you can give non-deterministic callbacks to unsuspecting third-party functions.

Does this `direct guessing' measure up to bona fide logic programming? After all, your language, like OCaml, may have no logic variables, unification, or constraint solvers. The talk -- and this page -- describe several classical examples of logic programming, all implemented in OCaml, with Hansei library. You can do logic programming in your language. All you need is fork(2) -- a form of clonable threads with a backtrackable (world-local) state.

The current version is September 2012.
StrangeLoop.pdf [167K]
Annotated slides of the talk presented at the Strange Loop conference. St Louis, MO, U.S.A. September 25, 2012

Tim Menzies, David Owen and Julian Richardson: ``The Strangest Thing About Software''
IEEE Computer, January 2007, pp. 54-60.
< >
Abstract: ``Although there are times when random search is dangerous and should be avoided, software analysis should start with random methods because they are so cheap, moving to the more complex methods only when random methods fail.''
The article makes the compelling case for guessing as a way of solving problems. It details two examples, scheduling and finite-model checking, where random search turns out faster than sophisticated methods. These examples are not accidental, the authors argue. [6K]
The code for the puzzle at the beginning of the talk

`Reversible' parser combinators
The second example in the talk [10K]
The code for the reversible type inference, the last example in the talk


Non-deterministic choice in a conventional programming language: Enough for logic programming?

[The Abstract of the talk]
Logic programming is a fascinating approach, especially for AI and natural language processing. It is greatly appealing to declaratively state the properties of the problem and let the system find the solution. Most intriguing is the ability to run programs `forwards' and `backwards'.

However, the built-in search methods of logic programming systems don't fit all problems and hardly if at all customizable. Mainly, quite many computations and models are mostly deterministic. Implementing them in a logic programming language is significantly inefficient and requires extensive use of problematic features such as cut. Another problem is interfacing logic programs with mainstream language libraries: if mode analysis is not available (as is often the case), one has to live with run-time instantiatedness errors.

An alternative to logic programming, where non-determinism is default, is a deterministic programming system (such as Scheme, OCaml, Scala or Haskell -- or even C) with (probabilistic) non-determinism as an option. Is this a good alternative? We explore this question. We will use Hansei -- a probabilistic programming system implemented as a library in OCaml -- to solve a number of classic logic programming problems, from zebra to scheduling, to parser combinators, to reversible type checking.

The present talk shares the motivation of the Strange Loop talk but not much of the material. Rather, the present talk illustrates strong and weak points of Prolog and contrasts Prolog and Hansei on a simple representative example of logic programming.

NII2012.pdf [198K]
Annotated slides of the presentation at the 55th Tokyo Programming Seminar (ToPS) at the National Institute of Informatics, Japan. December 18, 2012. [8K]
The complete source code for the Hansei examples in the first part of the talk.


No logic variables? No unification?

Logic variables and accompanying unification are such salient features of popular logic programming languages that logic programming seems inconceivable without them. The story of programming in logic without logic variables is dramatic, filled with theoretically inspired hopes dashed in practice. We show that although theoretical advances and improvements in the Hansei implementation are welcome, Hansei as it is is already capable of expressing the standard Prolog idioms, retaining their spirit. We can indeed do proper logic programming in Hansei: declaring relations and asking the system to find their model.

Logic programming without logic variables should be obvious. The minimal model semantics of Prolog needs no logic variables; after all, Herbrand basis is by definition a set of ground atoms. The fixpoint construction (iterating the immediate consequence operator) gives the algorithm of finding the minimal model with no use for variables or unification. The reality dims the hope: although the fixpoint construction is viable for definite Datalog programs, it is difficult to perform when the Herbrand basis is infinite and the logic program contains negation and committed choice.

Two recent theoretical results in functional logic programming, Antoy and Hanus (2006) and Dios Castro and Lopez-Fraguas (2006), point out that logic variables and non-determinism due to overlapping rules have the same computational power. The main idea is representing a logic variable by a generator of its domain, with sharing of the generator expressing repeated occurrences of the variable. The idea is intuitively clear: a logic variable indeed stands for some term, non-deterministically chosen from a suitable domain. The papers derive a systematic transformation of a program with logic variables into the one without, while preserving the set of derivations. Alas, preserving the set of derivations (reductions) is a rather weak notion of program equivalence. For example, it does not account for the multiplicity of results, considering a program that yields, say, a single 0 equivalent to a program yielding the infinite sequence of 0s. In logic programming practice, we usually do distinguish such programs, valuing programs that produce a finite set of answers in finite time. The other drawback of the theoretical frameworks used in both papers is the inability to reason about finite failure. A logic program that returns "NO" and a program that loops forever are regarded as equivalent in these frameworks. In practice we consider such programs as very different. Since the papers are written as theoretical, the details about sharing and fairness of generators are sketchy.

Aware of the theoretical hopes and practical shortcomings, we decided to press ahead and see just how far we can go expressing standard Prolog idioms in a language without logic variables. Pending adequate theoretical analysis, empirical case studies seem the only choice. The present web page documents our results. The somewhat unexpected conclusion is that we go quite far. Hansei as it is appears suitable for logic programming.

We have gone back to Herbrand: we build the Herbrand universe (the set of all ground terms) and explore it to find a model of a program (that is, tuples of terms that model all declared relations). We build the universe by modeling `logic variables' as generators for their domains. It greatly helps that Hansei is typed; the type of a `logic variable' specifies its domain and, hence, the generator. Since the Herbrand universe for most logic programs is infinite, we need non-strict evaluation. Furthermore, since logic variables may occur several times, we must be able to correlate the generators. Although Hansei is embedded in OCaml, which is strict, Hansei supports on-demand evaluated non-deterministic computations with sharing: so-called letlazy computations. Finally, since the search space is generally infinite, we need a systematic way of exploring it, without getting stuck in one infinite sub-region. Again Hansei provides a number of search strategies, including sampling and iterative deepening. (We shall use the latter in all our examples.)

Logic variables and unification have been introduced by Robinson as a way to `lift' ground resolution proofs of Herbrand, to avoid generating the vast number of ground terms. Logic variables effectively delay the generation of ground terms to the last possible moment and to the least extent. Doing computations only as far as needed is also the goal of lazy evaluation. It appears that lazy evaluation can make up for logic variables, rendering Herbrand's original approach practical.

The current version is December 2010.
Portoraro, Frederic: Automated Reasoning
The Stanford Encyclopedia of Philosophy (Winter 2010 Edition), Edward N. Zalta (ed.)
< >
Section 2.1 ``Resolution'' describes Robinson's idea of unification, which made Herbrand's approach practical. We avoid unification, using lazy evaluation rather than logic variables to improve the efficiency of the original Herbrand approach.

John Alan Robinson: ``Computational Logic: Memories of the Past and Challenges for the Future''
Proceedings of the First International Conference on Computational Logic, LNAI 1861, pp. 1-24, 2000.
John Alan Robinson reminisces about Herbrand, unification and the discovery of resolution.

Sergio Antoy and Michael Hanus: Overlapping rules and logic variables in Functional Logic Programs
ICLP 2006.

J. de Dios Castro and F.J. Lopez-Fraguas: Elimination of Extra Variables in Functional Logic Programs
Proc. V Jornadas sobre Programacion y Lenguajes (PROLE'06), pp. 121-135, 2006.


The append relation

We introduce logic programming in Hansei on the classical example of the appendrelation. Append relates three lists l1, l2 and l3 such that l3 is the concatenation of l1 and l2. All throughout we take our lists to be lists of booleans. We will contrast Prolog code with Hansei code, developing both incrementally and interactively. The responses of Prolog resp. OCaml top-level are indented.

In Prolog, the append relation is stated as:

     append([H|T],L,[H|R]) :- append(T,L,R).
declaring that an empty list is a prefix of any list, a list is a suffix of itself, and prepending to the prefix of a list prepends to the list. Certainly append can concatenate two lists:
     ?- append([t,t,t],[f,f],X).
          X = [t, t, t, f, f].
By passing a free logic variable as one of the two first arguments, we can concatenate not fully-determined lists:
     ?- append([t,t,t],X,R).
          R = [t, t, t|X].
We see the single result, which stands for every list with the prefix [t,t,t]. Such a compact representation of an infinite set is an asset. Alas, it is not always available, and is often a mirage. For example, if we pass a free variable as the first argument of append, Prolog responds differently:
     ?- append(X,[f,f],R).
          R = [f, f] ;
          R = [_G328, f, f] ;
          R = [_G328, _G334, f, f] ;
          R = [_G328, _G334, _G340, f, f] . 
with an infinite set of answers, to be enumerated indefinitely. It may not be fully apparent that our Prolog code has not faithfully represented the original problem: to relate boolean lists. The last two answers describe lists of more than mere booleans. We have to impose the restriction, by defining the type predicates
     boollist([H|T]) :- bool(H), boollist(T).
and re-writing our queries:
     ?- append([t,t,t],X,R), boollist(X), boollist(R).
          R = [t, t, t] ;
          R = [t, t, t, t] ;
          R = [t, t, t, t, t] ;
          R = [t, t, t, t, t, t] ;
          R = [t, t, t, t, t, t, t, t, t] .
     ?- append(X,[f,f],R), boollist(X), boollist(R).
          R = [f, f] ;
          R = [t, f, f] ;
          R = [f, f, f] . 
One of boollist(X) or boollist(R) would have been enough: if X is a boolean list, so is R. Alas, Prolog has no type inference and is unable to infer or take advantage of that fact. To be safe, we add both predicates. The compact representation for the lists with the [t, t, t] prefix has disappeared. More seriously, the default depth-first search strategy of Prolog gives us only half of the answers; we won't ever see the lists with an f after the first three t.

We switch to Hansei. In a typed language, types -- the specification of the problem -- come first. We start by defining the type blist of boolean lists with a non-deterministic spine, along with the convenient constructors:

     type bl = Nil | Cons of bool * blist
     and blist = unit -> bl;;
     let nil  : blist = fun () -> Nil;;
     let cons : bool -> blist -> blist = fun h t () -> Cons (h,t);;
and the conversion to ordinary, fully deterministic lists, to see the results.
     let rec list_of_blist bl = match bl () with
       | Nil        -> []
       | Cons (h,t) -> h:: list_of_blist t;;
         val list_of_blist : blist -> bool list = <fun>

We now define append, seemingly as a function, in the fully standard, also declarative, way.

     let rec append l1 l2 = match l1 () with
       | Nil        -> l2
       | Cons (h,t) -> cons h (fun () -> append t l2 ());;
         val append : blist -> blist -> blist = <fun>

An attempt to concatenate two sample lists:

     let t3 = cons true (cons true (cons true nil));;
     let f2 = cons false (cons false nil);;
     append t3 f2;;
         - : blist = <fun>

turns out not informative. Recall that we use the Hansei library to build a probabilistic model, which we then have to run. Running the model determines the set of possible worlds consistent with the probabilistic model: the model of the model. The set of model outputs in these worlds is the set of answers. Hansei offers a number of ways to run models and obtain the answers and their weights. We will be using iterative deepening:

     val reify_part : int option -> (unit -> 'a) -> (Ptypes.prob * 'a) list
The first argument is the depth search bound (infinite, if None).
     reify_part None (fun () -> list_of_blist (append t3 f2));;
         - : (Ptypes.prob * bool list) list = [(1., [true; true; true; false; false])]
Running our first append model has given the expected result, with the expected weight. Following along the Prolog code, we turn to appending to an indeterminate list.

Indeterminate lists are represented by generators. Hence, we need generators for bool and bool list domains. The generators must use letlazy, to delay the generation until the value is needed; once the value has been determined, it stays the same.

     let a_boolean () = letlazy (fun () -> flip 0.5);;
         val a_boolean : unit -> unit -> bool = <fun>
     let rec a_blist () = letlazy (fun () -> dist [(0.5, Nil); (0.5, Cons(flip 0.5, a_blist ()))]);;
         val a_blist : unit -> blist = <fun>

Let us see a sample of generated values, as a spot-check:

     reify_part (Some 5) (fun() -> list_of_blist (a_blist ()));;
         - : (Ptypes.prob * bool list) list =
           [(0.5,     []); 
            (0.125,   [false]); 
            (0.03125, [false; false]);
            (0.03125, [false; true]); 
            (0.125,   [true]);
            (0.03125, [true; false]);
            (0.03125, [true; true])]
The lists of different lengths have different probabilities, or different likelihoods of encountering them. We could use the accumulated probability mass as the measure of the explored search space, terminating the search once a threshold is reached. One is reminded of probabilistic algorithms such as the Rabin-Miller primality test.

We are now ready to reproduce the Prolog code, passing a `free' variable as an argument to append and obtaining a sequence of lists with the given prefix or suffix. Unlike Prolog, we are not stuck in a corner of the search space. We generate all lists with the given prefix [true; true; true].

     reify_part (Some 5) (fun() ->
      let x = a_blist () in
      list_of_blist (append t3 x));;
         - : (Ptypes.prob * bool list) list =
     	 [(0.5,     [true; true; true]); 
     	  (0.125,   [true; true; true; false]);
     	  (0.03125, [true; true; true; false; false]);
     	  (0.03125, [true; true; true; false; true]);
     	  (0.125,   [true; true; true; true]);
     	  (0.03125, [true; true; true; true; false]);
     	  (0.03125, [true; true; true; true; true])]
     reify_part (Some 5) (fun() ->
      let x = a_blist () in
      list_of_blist (append x f2));;
         - : (Ptypes.prob * bool list) list =
     	 [(0.5,     [false; false]); 
     	  (0.125,   [false; false; false]);
     	  (0.03125, [false; false; false; false]);
     	  (0.03125, [false; true; false; false]); 
     	  (0.125,   [true; false; false]);
     	  (0.03125, [true; false; false; false]);
     	  (0.03125, [true; true; false; false])]

Prolog's append is so elegant because it defines a relation among three lists. One can specify any two lists and query for the other one that makes the relation hold. For example, we check if a given list has a given prefix, and if so, remove it:

     ?- append([t,t],X,[t,t,t,f,f]).
         X = [t, f, f].

We can do the same in Hansei, effectively `running append backwards.' We introduce a sample list to deconstruct and a function two compare two lists. It is an ordinary function, not a built-in.

     let t3f2 = append t3 f2;;
     let rec bl_compare l1 l2 = match (l1 (), l2 ()) with
      | (Nil,Nil)  -> true
      | (Cons (h1,t1), Cons (h2,t2)) -> h1 = h2 && bl_compare t1 t2
      | _          -> false;;
         val bl_compare : blist -> blist -> bool = <fun>
The prefix-removal example is as follows. In words: we assert that there exists a boolean list x such that appending it to [true; true] yields the list t3f2. We ask Hansei to find all possible worlds in which the assertion holds and report the value of x in these worlds.
     reify_part None (fun() ->
      let x = a_blist () in
      let r = append (cons true (cons true nil)) x in
      if not (bl_compare r t3f2) then fail ();
      list_of_blist x);;
         - : (Ptypes.prob * bool list) list = [(0.0078125, [true; false; false])]
Since the search bound was None, the returned single result came from the exhaustive enumeration of the entire search space. The search space is indeed infinite, but amenable to efficient exploration because most of the possible worlds are inconsistent with the assertion and could be rejected wholesale, without generating them. For example, when comparing an indeterminate list to an empty list, bl_compare returns the result after forcing a single choice. One can tell if a list is empty by checking for the first Cons cell. One does not need to know how many else elements there may be in the list. In general, if the lists are not equal, it can be determined by examining their finite prefix. Furthermore, if one of bl_compare's arguments has a fixed length (as is the case in our example: t3f2 is the list of 5 elements), the comparison finishes after forcing finitely many choices. The ability of bl_compare to deal with partially determined lists relates it to unification: unifying [X|Y] with the empty list fails without the need to instantiate X or Y, in effect performing the comparison with the infinite set of concrete lists.

We can likewise use append to check for, and remove, a given suffix. We can also split a given list in all possible ways, returning its prefixes and suffixes. If the list is finite, we obtain the finite number of answers, in Prolog:

     ?- append(X,Y,[t,t,t,f,f]).
         X = [],  Y = [t, t, t, f, f] ;
         X = [t], Y = [t, t, f, f] ;
         X = [t, t], Y = [t, f, f] ;
         X = [t, t, t], Y = [f, f] ;
         X = [t, t, t, f], Y = [f] ;
         X = [t, t, t, f, f], Y = [] ;
and, similarly, in Hansei:
     reify_part None (fun() ->
      let x = a_blist () in
      let y = a_blist () in
      let r = append x y in
      if not (bl_compare r t3f2) then fail ();
      (list_of_blist x, list_of_blist y));;
         - : (Ptypes.prob * (bool list * bool list)) list =
     	 [(0.000244140625, ([], [true; true; true; false; false]));
     	  (0.000244140625, ([true], [true; true; false; false]));
     	  (0.000244140625, ([true; true], [true; false; false]));
     	  (0.000244140625, ([true; true; true], [false; false]));
     	  (0.000244140625, ([true; true; true; false], [false]));
     	  (0.000244140625, ([true; true; true; false; false], []))]
The Hansei code follows the demonstrated pattern, of building a probabilistic model returning a pair of two `random' lists x and y, conditioned upon the asserted evidence that their concatenation is t3f2. In the above code, the OCaml variable x is used twice. Although it is bound to a generator, its two use occurrences refer to the same generated value -- thanks to letlazy in the definition of a_blist.

Our final example is last, relating a list to its last element. We contrast Prolog code

     last(E,L) :- append(_,[E],L).
     ?- last(E,[t,t,f]).
     E = f ;

to Hansei code:

     let last bl = 
      let x = a_blist () in
      let e = a_boolean () in
      if not (bl_compare (append x (cons (e ()) nil)) bl) then fail ();
         val last : blist -> unit -> bool = <fun>
     reify_part None (fun() -> last t3f2 ());;
         - : (Ptypes.prob * bool) list = [(0.0009765625, false)]

There are still open questions. The following code produces the result with no restrictions on the search space:

     reify_part None (fun () ->
      let x = a_blist () in
      let y = a_blist () in
      if not (bl_compare x t3f2) then fail ();
      if not (bl_compare x y) then fail ();
      list_of_blist y);;
         - : (Ptypes.prob * bool list) list = [(2.384185791015625e-07, [true; true; true; false; false])]

A slightly modified code, with (bl_compare x y) first, although gives the same result, requires the restriction on the search depth. The search space can no longer be efficiently enumerated. Can we automatically reorder the comparison statements, to improve the search? Another open problem is the `occurs check.' Its absence in Hansei again necessitates the restriction on the search space. None of the two open issues are fatal: restricting the search by depth or by the remaining probability mass (effectively, by the probability of missing a solution if one exists) is always possible and even natural. It is nevertheless interesting to see if the occurs check and the unification of two unbound variables have a clear analogue in the variable-less case.

The current version is December 2010.
References [8K]
The complete source code for the example.


Solving scheduling problems

Scheduling problems are among the clearest applications of logic programming. It is appealing just to state constraints and ask the system to find a feasible schedule satisfying the constraints. A simple scheduling problem posed by Mikael More is our next example of using Hansei for logic programming. We shall indeed state the constraints (as the `evidence') and ask Hansei to find the possible worlds consistent with that evidence. Here is the problem:
For the daily schedule for Monday to Wednesday:
  • One of the days I'll shop;
  • One of the days I'll take a walk;
  • One of the days I'll go to the barber;
  • One of the days I'll go to the supermarket;
  • The same day as I go to the supermarket, I'll shop;
  • The same day as I take a walk I'll go to the barber;
  • I'll go to the supermarket the day before the day I'll take a walk;
  • I'll take a walk Tuesday.
Which are all possible daily schedules, regarding those four events?
We take the phrase `One of the days I'll shop.' to mean that `I shop only on one day out of the three.' Otherwise, there are too many solutions: for example, I can do all the activities on each day. We develop our examples interactively, by submitting expressions to the top-level of the OCaml interpreter after loading the Hansei library. In the transcript below, the OCaml top-level responses are indented.

The gist of the problem is selecting a subset of actions to do on one day from the set of possible actions. In other words, we need to sample from a powerset. Such a sampling function could have been provided by Hansei. It is however easy and illustrative to define from scratch:

     open ProbM;;
     let powerset lst = List.fold_left (fun acc e -> if flip 0.5 then e::acc else acc) [] lst;;
         val powerset : 'a list -> 'a list = <fun>
In words: to choose a subset of the set lst we decide, randomly and independently, for each element of lst whether to take it or not. We use OCaml lists to represent sets. To spot-check our sampling procedure we compute the probability table for all samples from powerset [1;2;3]:
     exact_reify (fun () -> powerset [1;2;3]);;
         - : int list Ptypes.pV =
             [(0.125, Ptypes.V [3; 2; 1]); 
              (0.125, Ptypes.V [3; 2]);
              (0.125, Ptypes.V [3; 1]); 
              (0.125, Ptypes.V [3]); 
              (0.125, Ptypes.V [2; 1]);
              (0.125, Ptypes.V [2]);
              (0.125, Ptypes.V [1]);
              (0.125, Ptypes.V [])] 
Indeed all subsets are present, with the equal probabilities.

We define another simple but helpful function, which asserts the proposition x in the current world:

     let mustbe x = if not x then fail ();;
         val mustbe : bool -> unit = <fun>

We are ready to encode our problem. We define the datatype enumerating the possible actions:

     type action = Shop | Walk | Barber | Supermarket;;
To define the model we specify the sampling procedure for the actions and enumerate the constraints, following the most straightforward generate-and-test approach.
     let schedule_model () =
      let ndays = 3 in
We determine the set of actions to do on day d. The set is chosen randomly from the set of all possible actions. However, if we ask again which actions were chosen for day d, we should get the same answer. Thus actions d acts as a logic variable: its value, initially indeterminate, once determined, does not change. We use the memo facility of Hansei to define such memoized stochastic functions.
     let actions = memo (fun d -> powerset  [Shop; Walk; Barber; Supermarket]) in
The proposition action_on a d states that on day d I do action a. The proposition only_on a d asserts that I do action a only on day d (and not on any other day):
     let action_on a d = List.mem a (actions d) in
     let only_on a d   = [d] = List.filter (action_on a) [0;1;2] in
We now state the constraints. For example, the first two lines below assert that there exists a day on which I shop. I do not shop on any other day.
     let d = uniform ndays in
     let _ = mustbe (only_on Shop d) in
     let d = uniform ndays in
     let _ = mustbe (only_on Walk d) in
     let d = uniform ndays in
     let _ = mustbe (only_on Barber d) in
     let d = uniform ndays in
     let _ = mustbe (only_on  Supermarket d) in
     (* The same day as I go to the supermarket, I'll shop.
        That is, there is a day that I perform the action Shop and the action Supermarket
     let d = uniform ndays in
     let _ = mustbe (action_on Supermarket d && action_on Shop d) in
     let d = uniform ndays in
     let _ = mustbe (action_on Walk d && action_on Barber d) in
     (* I'll go to the supermarket the day before the day I'll take a walk. *)
     let d = uniform ndays in
     let _ = mustbe (action_on Walk d && d > 0 && action_on Supermarket (d-1)) in
     (* I'll take a walk Tuesday. *)
     let _ = mustbe (action_on Walk 1) in
Finally, we return the schedule: the array of actions for each day. This is the output of the model.
     Array.init ndays actions;;
        val schedule_model : unit -> action list array = <fun>
The model is defined. We now `solve' it, determining if there exists a schedule satisfying all constraints. Hansei's reply is nearly instantaneous.
     exact_reify schedule_model;;
         - : action list array Ptypes.pV =
         [(1.11632658893461385e-07,  Ptypes.V [|[Supermarket; Shop]; [Barber; Walk]; []| ])] 
There is only one schedule satisfying the constraints: on Monday I go to the supermarket and shop, on Tuesday I walk and take a haircut. Hansei has also computed the `probability' of the solution, which is the the estimate of the search space. Each choice in the model corresponds to a possible world; many possible worlds are not consistent with the stated constraints and facts and are rejected. The reported 1.11632658893461385e-07 is the weight of the surviving possible world(s), the one(s) that satisfied all constraints. Since all our choices were uniform, the weight represents one in 1/1.11632658893461385e-07 = 8957952 that many worlds. Since worlds can be rejected wholesale, Hansei did not have to generate all of them. If one fact in a world contradicts the evidence, we do not need to know the other facts; that is, we do not need to make the choices for them.
The current version is November 2010.
References [4K]
The complete source code for the example.


Zebra puzzle

Solving logic puzzles is the best (detractors say, the only) application of logic programming. A characteristic, old, and well-known logic puzzle is `zebra', sometimes attributed to Einstein or Lewis Carroll. Zebra is another scheduling problem, asking for a schedule satisfying given constraints -- in particular, the `all-different' constraint. In ordinary logic programming all_different is a hassle to state and a chore to evaluate; constraint logic programming (sub)systems (e.g., SICStus) typically rely on external constraint solvers, where all-different is built-in. Even without `all-different' the puzzle is hard; the naive generate-and-test approach is too slow; a bit of creativity is required.

Hansei's formulation of the problem is straightforward and declarative: the naive generate-and-test. That was the point, to see how well Hansei copes without the benefit of a creative formulation or fine-tuning. Hansei copes quite well. It helped that deterministic parts of the puzzle more naturally express in a functional language with data structures and pattern-matching (Hansei's host language, OCaml) than in Prolog. We use non-determinism only when we need it. It also greatly helped that the all-different constraint, or sampling all permutations, turns out efficiently, lazily, implementable in Hansei.

The current version is December 2010.
< > [6K]
The complete, commented source code, with every significant line annotated.
Hansei has determined that the puzzle has only one solution, whose weight is 5e-12. The search space is large indeed.

Lazy permutations and all-different constraint


`Reversible' parser combinators

Parsing is an old application of logic programming and the original motivation for Prolog. A grammar written in Prolog relates strings and their parses. It can both generate and recognize: We can obtain parses of a given string -- or, conversely, find all strings generated by the grammar or the strings with a specific parse. All this is possible in Hansei. Furthermore, Hansei lets us build grammars by combining primitive parsers using ordinary functional application and composition, fully employing higher-order functions, types and other benefits of a modern functional language.

The Hansei parser combinator library looks quite like the famous Parsec, by which it is inspired. To be sure, the sort of non-determinism needed for Parsec is easily emulated in many languages -- that's why Parsec has spread so widely. And yet Hansei brings something new: running Parsec parsers not only forwards, parsing a given stream, but also effectively backwards, to generate a stream.

As usual, Hansei parsers digest a stream of characters; the characters do not need to be present all in memory, but can be read on demand. That's why stream is a thunk. A parser takes a stream and returns the parsing result, the result of a semantic action, and the remainder of the stream:

     type stream_v = Eof | Cons of char * stream
     and stream = unit -> stream_v
     type 'a parser = stream -> 'a * stream

The two primitive parsers are pure

     let pure : 'a -> 'a parser = fun x st -> (x,st)
which parses the empty string returning its argument as the parsing result, and p_sat:
     let p_sat : (char -> bool) -> char parser = fun pred st ->
       match st () with
       | Cons (c,st) when pred c -> (c,st)
       | _                       -> fail ()
that checks that the current element of the stream satisfies a given predicate -- and if it does not, reports failure using Hansei's fail. Other parsers are written in terms of the above, for example, the parser of a character
     let p_char : char -> char parser = fun c -> p_sat (fun x -> x = c)

Combinators combine parsers and their semantic actions and express the rules of the grammar:

     val (<*>)   : ('a -> 'b) parser -> 'a parser -> 'b parser
     val (<$>)   : ('a -> 'b) -> 'a parser -> 'b parser 
     val ( *> )  : 'a parser -> 'b parser -> 'b parser
     val ( <* )  : 'a parser -> 'b parser -> 'a parser
     val p_fix   : ('a parser -> 'a parser) -> 'a parser
     val many    : 'a parser -> 'a list parser

For example, (<*>) combines parsers sequentially. The parsers (<$>), ( *> ), ( <* ) are the variations with respect to semantic actions; their types show what they do. Alternative rules in a grammar are expressed with the alternation combinators

     let (<|>) : 'a parser -> 'a parser -> 'a parser = fun p1 p2 st ->
       uniformly [|p1;p2| ] st
     let alt : 'a parser array -> 'a parser = fun pa st ->
       uniformly pa st
which rely on the Hansei operator for non-deterministic choice, of a parser from a given set.

Here is the first example: recognizing palindromes. Since we build a recognizer, the semantic actions do nothing, returning unit. The grammar reads quite like BNF: A palindrome over the two-character alphabet is either the empty string, a single character, or a palindrome flanked on both sides with the same character:

     let pali = p_fix (fun pali ->
       alt [| empty;
              (fun _ -> ()) <$> p_char 'a';
              (fun _ -> ()) <$> p_char 'b';
              p_char 'a' *> pali <* p_char 'a';
              p_char 'b' *> pali <* p_char 'b' | ]

To parse a string according to the grammar -- to `run the parser forwards' -- we convert a string to a stream, apply the parser and check that the stream is fully consumed:

     let run_fwd : 'a parser -> string -> 'a Ptypes.pV = fun p s ->
       exact_reify (fun () ->
         let (v,s) = p (stream_of_string s) in
         if s () <> Eof then fail (); v)
We could just as well parse a file rather than an in-memory string. Evaluating run_fwd pali "abaaba" shows that "abaaba" is a palindrome.

More interestingly, the very same parser can be run effectively backwards, telling us all the strings it parses.

     let run_bwd : int option -> 'a parser -> (unit -> stream) -> ('a * string) Ptypes.pV 
         = fun n p stm -> Inference.explore n (reify0 (fun () ->
             let st = stm () in
             let (v,st') = p st in
             if st' () <> Eof then fail (); 
             (v,string_of_stream st)))
Since the set of such strings can be infinite, we have to bound the search, specifying the inference depth limit as the first argument of run_bwd (None means no limit). For example, to see a few palindromes, we evaluate:
     run_bwd (Some 10) pali (fun () -> stream_over [|'a';'b'| ])
To run a parser effectively backwards, all we need is to give the parser a random stream to parse. The function stream_over generates all streams over a given alphabet -- the infinite number of them. The parser filters the set of streams; the surviving streams are then reported as parseable.

The next example finds all 5-letter palindromes. We add the predicate stream_len st 5, which fails if the stream does not have 5 elements:

     run_bwd None pali (fun () ->
       let st = stream_over [|'a';'b'| ] in
       stream_len st 5; st)
Although the code indeed generates all eight 5-letter palindromes (equally likely), it is quite puzzling. No search limit was imposed, and yet the search terminated, even though stream_over produces the infinite sequence of streams. Since stream_len rejects all streams whose length is not 5, we should get stuck generating longer and longer streams, and rejecting them all. But we don't.

Let's see the code for stream_len and recall the type of the stream:

     type stream_v = Eof | Cons of char * stream
     and  stream   = unit -> stream_v
     let rec stream_len st n = match (st (),n) with
       | (Eof,0)                    -> ()
       | (Cons (_,t),n) when n > 0  -> stream_len t (n-1)
       | _                          -> fail ()

All characters of the stream indeed do not have to be in memory at once: they can be read on-demand, when the thunk, representing the stream, is forced. In stream_len code, this demand is expressed as st (). Forcing the stream brings in a new character, and the thunk for the tail of the stream. Clearly, stream_len st 5 forces no more than 6 thunks. If the 6th thunk yields a Cons, the function fails and never examines the stream any further. The rest of the stream is never demanded, and is never chosen.

There remains one more mystery of run_bwd: in its code, shown earlier, the identifier st is used three times, as if it were the same sequence of characters. However, st is a thunk, choosing the sequence of characters on-demand. Forcing the thunk st () in one place (e.g., inside stream_len) does not guarantee that the evaluation of st () somewhere else would give exactly the same choice. Sharing a procedure for making choices does not mean that the choices themselves are shared. For that reason, the stream generator stream_over employs a trick:

     val letlazy : (unit -> 'a) -> (unit -> 'a)
     let stream_over : char array -> stream = fun ca ->
       let rec loop () =
         if flip 0.5 then Cons (uniformly ca, letlazy loop)
                     else Eof
       in letlazy loop
The function letlazy, which looks from its type as an identity function, takes a thunk and returns a thunk. When the result of letlazy thunk is repeatedly forced, it repeatedly makes the same choice. Such non-deterministic laziness requires first-class, or world-local memory -- which is what Hansei implements. In functional-logic programming, this is called ``call-time choice''. In quantum mechanics, in is called ``wavefunction collapse''. Before we observe a system, for example, before we look at a flipped coin, there could indeed be several choices for the result. After we looked at the coin, all further looks give the same value.

The source code gives another example of reversible parsing: parsing an arithmetic expression and computing its value. The grammar looks like the one shown in nearly every book on compiler construction:

     let digit = (fun c -> Char.code c - Char.code '0') <$> p_sat (fun x -> x >= '0' && x <= '9')
     let num = (List.fold_left (fun acc d -> d + acc * 10) 0) <$> many1 digit
     let rec expression st =
       ( ((fun t1 op t2 -> op t1 t2) <$> term <*> plus_minus <*> expression) 
         <|> term
       ) st
     and term st =
       ( ((fun t1 op t2 -> op t1 t2) <$> factor <*> prod_div <*> term) 
         <|> factor
       ) st
     and factor st =
       ( (p_char '(' *> expression <* p_char ')')
         <|> num
       ) st
The code demonstrates both the parsing of a string and obtaining all valid arithmetic expressions, or expressions that evaluate to 5.

The parsing combinator library demonstrated that reversible parsers are just as possible in Hansei as they are in Prolog. Prolog is a separate language whereas Hansei is just an OCaml library. The parsing combinator library shows off the laziness principle: delaying the choices until the last possible moment, hoping that the moment will never arrive. We have seen how the principle has made the infinite search finite. Less haste, infinitely more speed.

The current version is September 2012.
References [16K]
Hansei code for the combinator library and parsing and unparsing two sample grammars: palindromes and integer expressions

Guess Lazily! Making a program guess, and guess well
The parsing combinator library was the main example in the talk and was discussed in detail.

Maximal munch, committed choice and nested inference


Maximal munch, committed choice and nested inference

Kleene star is an intrinsic operator in regular expressions and is commonly used in EBNF and other grammar formalisms. Just as common is the so-called ``maximal munch'' restriction on the Kleene star, forcing the longest possible match. After reminding why maximal munch is so prevalent, we describe the grave problem it poses for parsers that are meant to be run both forwards and backwards -- that is, to parse a given stream according to the grammar and to generate grammar's language, all streams it recognizes. Maximal munch cuts shorter-match choices and reduces non-determinism -- hence making forward runs faster. On the downside, when running the parser backwards the cut choices mean lost solutions and the (greatly) incomplete language generation. Hansei removes the downside. Parsers built with the Hansei parser combinator library support maximal munch and can be run effectively backwards to generate the complete language, without omissions. Surprisingly, Hansei already had the necessary features, in particular, the nested inference.

What is maximal munch and why it is so common that is hardly ever mentioned? A programming language specification cliche defines the syntax of an identifier as a letter followed by a sequence of letters and digits, or, in the extended BNF,

     identifier ::= letter letter_or_digit*
where *, the Kleene star, denotes zero or more repetitions of letter_or_digit. Thus in a program fragment var1 + var2, var1 and var2 are identifiers. Why not v, va, and var? According to the above grammar every prefix of an identifier is also an identifier. To avoid such conclusions and the need to complicate the grammar, the maximal munch rule is assumed: letter_or_digit* denotes the longest sequence of letters and digits. Without the maximal munch, we would have to write
     identifier ::= letter letter_or_digit* [look-ahead: not letter_or_digit]
which is not only awkward and requires the notation for look-ahead, but is also much less efficient. If * means mere zero or more occurrences, letter_or_digit* on input "var1 " will match the empty string, "a", "ar" and "ar1". Only the last match leads to the successful parse of the identifier, recognizing var1. Maximal munch cuts the irrelevant choices. It has proved so useful that it is rarely explicitly stated when describing grammars.

Maximal munch however all but destroys the reversible parsing, the ability to run the parser forward (as a parser or recognizer) and backward (as a language generator). We illustrate the problem in Prolog. A recognizer in Prolog is a relation between two streams (lists of characters) S and Srem such that Srem is the suffix of S. In a functional language, we would say that a recognizer recognizers the prefix in S, returning the remaining stream as Srem. Here is the recognizer for the character 'a':

The Kleene-star combinator (typically called many) takes as an argument a recognizer and repeats it zero or more times. Without the maximal munch, it looks as follows:
     many0(P,S,Rest) :- call(P,S,Srem), many0(P,Srem,Rest).
Thus many0(charA,S,R) will recognize or generate the prefix of S with zero or more 'a' characters. Thanks to the first clause, many0(P,S,R) always recognizes the empty string. Here is how we recognize a* in the sample input stream [a,a,b]:
     ?- many0(charA,[a,a,b],R).
     R = [a, a, b] ;
     R = [a, b] ;
     R = [b]
and generate the language of a*:
     ?- many0(charA,S,[]).
     S = [] ;
     S = [a] ;
     S = [a, a] ;
     S = [a, a, a] ;
     S = [a, a, a, a] ; ... 

To implement the maximal munch, many should call the argument parser for as long as it succeeds. We need a way to tell if the parser fails or succeeds. To branch on success or failure of a goal, Prolog offers soft-cut. Recall, soft-cut P *-> Q; R is equivalent to the conjunction P, Q if P succeeds at least once. Soft-cut commits to that choice and totally discards R in that case. R is evaluated only when P fails from the outset. Soft-cut lets us write many with maximal munch:

     many(P,S,Rest) :- call(P,S,Srem) *-> many(P,Srem,Rest) ; S = Rest.
Now the the empty string is recognized (i.e., S = Rest) only if the parser P fails. Recognizing a* in the sample input
     ?- many(charA,[a,a,b],R).
     R = [b].

becomes quite more efficient. There is only one choice, for the longest sequence of as. However, attempting to generate the language a*:

     ?- many(charA,S,[]).
leads to an infinite loop. The argument recognizer, charA, when asked to generate, always succeeds. Therefore, the recursion in many never terminates. When running backwards, the recognizer tries to generate, disregarding all other choices, the longest string of as -- the infinite string. Although the empty string is part of the language a*, we fail to generate it.

The appearance of the soft-cut in the definition of many should have already raised the alarm. After all, soft-cut embodies negation, which is required to detect failure and proceed. In fact, negation can be expressed via soft-cut: not(P) :- P *-> fail; true.

The Hansei parser combinator library supports many, which, unlike the one in Prolog, no longer forces the trade-off between efficient parsing and generation. Hansei's many obeys maximal munch and generates the complete language, with no omissions. Hansei lets us have it both ways. Before describing the implementation, we show a few representative examples.

Parser Stream Result
many (p_char 'a') aaaa  uniquely 
2 many (p_char 'a') <*> p_char 'b' b uniquely
3 many (p_char 'a') <* p_char 'a' aaa NO parse
4 many (p_char 'a') <*> many (p_char 'a')) aaa uniquely
5 many ((p_char 'a') <|> (p_char 'b')) ababb uniquely
6 many ((many1 (p_char 'a')) <|> (many1 (p_char 'b'))) aaabab uniquely
7 many ((p_char 'a' <* p_char 'a') <|> p_char 'a') <* p_char 'a' aaa NO parse
8 many (p_char 'a')  random stream  "" "a" "aa" ...
Examples 5-7 show the argument parsers with choices, even overlapping choices as in Example 7. The combinator many (actually, many1 defined as many1 p = p <* many p) may nest. In Example 3, a*a does not recognize "aaa" since the a* munches the entire stream leaving nothing for the parser of the final a. This is the expected behavior under maximal munch. Example 7 shows no parse for the same reason. Incidentally, the example demonstrates that under maximal munch,
     NOT (many (p1 <|> p2) >= many p1 <|> many p2)
where >= is to be understood as language inclusion. This inequality does hold for many without maximal munch (think of (a+b)^n >= a^n + b^n for non-negative a, b and n). Finally in the last example we generate the complete language for a*, including the empty string. That is, many p has the property
     many p (s1 | s2) = many p s1 | many p s2
where | means non-deterministic choice.

To implement the maximal munch in Hansei we need something like soft-cut, the ability to detect a failure and proceed. Hansei has exactly the right tools: reify0 and reflect:

     type 'a vc = V of 'a                          (* leaf *)
                | C of (unit -> 'a pV)             (* unexpanded branch *)
     and  'a pV = (prob * 'a vc) list
     val reify0 : (unit -> 'a) -> 'a pV
     val reflect : 'a pV -> 'a

The primitive reify0 converts a probabilistic computation to a lazy tree of choices 'a pV, whose nodes contain found solutions V of 'a or not-yet-explored branches. The primitive reflect, the inverse of reify0, turns a tree of choices into a probabilistic program that will make those choices. The primitive reify0 is fundamental in Hansei: probabilistic inference is implemented by first reifying a program (the generative model) to the tree of choices and then exploring the tree in various ways. For instance, the full tree traversal corresponds to exact inference.

Soft-cut can also be implemented as a choice-tree traversal, first_success below, which explores the branches looking for the first V leaf. It returns the tree resulting from the exploration, which could be empty if no V leaf was ever found. Soft-cut then is simply

     val first_success : 'a pV -> 'a pV
     let soft_cut : (unit -> 'a) -> ('a -> 'w) -> (unit -> 'w) -> 'w =
       fun p q r ->
         match first_success (reify0 p) with
         | [] -> r ()
         | t  -> q (reflect t)
We write many in terms of the soft-cut as we did in Prolog:
     let many : 'a parser -> 'a list parser = fun p ->
       let rec self st =
         soft_cut (fun () -> p st)           (* check parser's success  *)
                  (fun (v,st) ->             (* continue parsing with p *)
                    let (vs,st) = self st in
                  (fun () -> ([],st))        (* total failure of the parser *)
       in self

The second question is avoiding losing solutions when running the parser ``backwards''. Unlike Prolog, Hansei parsers are functions rather than relations. They take a stream, attempt to recognize its prefix and return the rest of the stream on success. They cannot be run backwards. However, we achieve the same result -- generating the set of parseable streams -- by generating all streams, feeding them to the parser and returning the streams that parsed completely. Since the number of possible streams is generally infinite, we have to generate them lazily, on demand. To ensure completeness -- to avoid losing any solutions -- the parsers should have the property

     many p (s1 | s2) = many p s1 | many p s2
Surprisingly, many p already satisfies it. The trick is laziness in the stream and Hansei's support of nested inference. The primitive reify0 may appear in probabilistic programs -- in other words, a probabilistic model may itself perform inference, over an inner model. In order for this to work correctly, we had to ensure that
     let x = letlazy (s1 | s2) in reify0(fun () -> model x) 
     let x = letlazy s1 in reify0(fun () -> model x) |
     let x = letlazy s2 in reify0(fun () -> model x)
where x is demanded in model. That is, reify0 should reify only the choices made by the inner program, and let the outer choices take effect. The stream generator has to be lazy, so it has the form letlazy (s1 | s2)  . Comparing the nested inference property with the code for many reveals that the key property of many p (s1 | s2) is satisfied without us needing to do anything. Our many has exactly the right semantics.

In conclusion, standard logic programming all too often forces us to choose between efficiency and expressiveness, on one hand, and completeness on the other hand. Negation and committed choice make logic programs easier to write and, in some modes, faster to run. Alas, some other modes (informally, running `backwards') become unusable or impossible. Kleene star is a perfect example of the trade-off: maximal munch simplifies the grammar and makes parsing efficient, but destroys the ability to generate grammar's language. Functional logic programming systems can remove the trade-off. Properly implemented encapsulated search (nested inference, in Hansei) lets us distinguish the choices of the parser from the choices of the stream and cut only the former. Perhaps surprisingly this distinction just falls out of the need for non-deterministic stream to be lazy. Thus Kleene star with maximal munch lets us parse and generate the complete language. In Hansei, we can have it both ways.

The current version is October 2012.
References [16K]
Hansei code for the combinator library. The code contains several unit tests for many. In addition, the arithmetic expression grammar relies on many to parse integer literals.

`Reversible' parser combinators

LogicT - backtracking monad transformer with fair operations and pruning
Extensive discussion of committed choice

Last updated January 1, 2013

This site's top page is or
Your comments, problem reports, questions are very welcome!

Converted from HSXML by HSXML->HTML