Classical Prolog is an enchanting misconception. It ought to be studied, for its ideas and lessons. One lesson is that guessing-- non-determinism-- is fundamental, but should not be the default mode of execution. One should guess, but guess lazily.
The strong points of Prolog can be brought into an ordinary functional programming language. Using OCaml as a representative, we implement lazy guessing as a library, with which we reproduce classical Prolog examples. Furthermore, we demonstrate parser combinators that use committed choice (maximal munch) and can still be run forwards and backwards. They cannot be written in Classical Prolog. Logic variables, unification, and its WAM compilation strategy naturally emerge as a ``mere optimization'' of the Herbrand universe enumeration.
Annotated slides of the talk at the 31st JSSST meeting, Sep 9, 2014
Source code for the Prolog examples in the first part of the talk
The complete source code for the Hansei examples in the second part of the talk
Non-deterministic choice in a conventional programming language: Enough for logic programming?
Annotated slides of the presentation at the 55th Tokyo Programming Seminar (ToPS) at the National Institute of Informatics, Japan. December 18, 2012.
An inspiring conversation with Mikael More brought up unthought-of questions. What if we do try to use Hansei to solve typical logic programming puzzles? Can we even do it? Can we do it efficiently? How much of logic programming will remain? In particular, can we read the Hansei code as a clear, declarative statement of the problem, without being distracted by the details of the solution? This present web page documents the attempts to answer all these questions. Hansei turns out suitable for logic programming after all.
We discuss logic programming without logic variables in a separate article below. Here we argue why one might want to consider Hansei when solving a typical logic programming problem.
Hansei adheres to declarative programming, by separating building probabilistic models from inference. A model describes the properties of a possible world and defines computations and constraints on them. Inference finds possible worlds satisfying the properties and the constraints. Building models and writing inference (search) procedures are distinct activities with distinct abstractions; writing custom inference procedures is rare since the ones in Hansei often suffice.
Many uses of cuts, negation and committed choice in Prolog are a way of telling that certain predicate is (semi-)deterministic; in other words, the underlying relation is functional. In Hansei, it is the determinism that is the default; deterministic computations run `at full speed' of OCaml, with no overhead. Hansei is thus particularly suitable for logic programs with a large portion of deterministic, especially numeric, computations. Hansei, as underlying OCaml, is typed. Types are notably useful in logic programming, to avoid a typical Prolog problem, of a mis-spelled constant or variable leading to an incorrect answer rather than an error message. Without types, such errors are frustratingly difficult to find.
Probability estimations turn out useful too: if we prematurely terminate a Hansei search, we obtain an estimate of the probability of missing a solution. Assigning different weights to different choices (that is, using non-uniform probability distributions) is a way to supply heuristics to bias the search.
Except for probability, the arguments for Hansei are the arguments for functional logic programming, equally applicable to the languages like Curry. The strong case for functional-logic programming has been made elsewhere, see in particular surveys by Michael Hanus. The advantage of Hansei is that it is not a separate language: it is just a user-extensible OCaml library. The arguments for Hansei also apply to embeddings of non-determinism in a typed functional language (for example, MonadPlus in Haskell), provided the crucial feature of lazy sharing is supported.
HANSEI: Embedded Probabilistic Programming
Purely Functional Lazy Non-deterministic Programming
The paper describes the importance, the equational theory and the implementation of lazy sharing. The implementation of
letlazy in Hansei is based on the technique explained in the paper.
Guessing is a part of life and science. We form a hypothesis, work out the consequences and compare with observations. Lots of problems are formulated by first assuming that the solution exists and then describing the properties it should have. Planning, scheduling, diagnostic, learning problems and Sudoku all follow this pattern. Guessing is good not only for describing these problems but also for solving them. We make a guess -- often a series of guesses -- to build, for example, a schedule, and check if it satisfies resource, timeliness and other constraints. Often, we guess again.
How do we write ``guess the value of this variable'' in code? How do we code ``guess again''? How to put in prior knowledge favoring some guesses? The talk first will answer these questions.
Naive guessing however is hopeless even for toy problems. We often have to make lots of guesses before getting a candidate solution to check against the constraints. Only a tiny or even infinitesimal proportion of these guesses yield a successful candidate. How to make good guesses? That is very hard to know: Most physical, biological, sociological, etc. models are set up to compute the consequences of causes rather than the other way around. It helps to reformulate the question: how to avoid too many bad guesses? The talk will describe and illustrate a general principle, found in any serious logic, non-deterministic or probabilistic programming system.
The techniques explained in the talk are not tied to any language and can be used even in C. However, functional, especially typed functional languages have a serious advantage, as we shall see. No prior knowledge of logic or non-deterministic programming is required. The ability to read introductory OCaml or Haskell code will be helpful. The participants will learn how to guess in their favorite language, and what it takes to succeed. They will see laziness, unification and constraint propagation in the same light.
There are several programming languages build around non-determinism: standalone -- such as Icon, Prolog, Curry -- or deeply embedded like Kanren. The talk is not about them. It is about your language. Although the talk uses OCaml for concreteness, the main points are not OCaml-specific and have been realized in other languages. We will not be talking about implementing a dialect of Prolog in your language. Rather, we argue for using non-determinism in your language directly. There is no new language to learn or a strict convention to follow. You write code as you normally do; you call any function in any library as usual -- without any wrappers and without pondering how to turn that third-party function into a relation or to make it reversible. Likewise, you can give non-deterministic callbacks to unsuspecting third-party functions.
Does this `direct guessing' measure up to bona fide logic programming?
After all, your language, like OCaml, may have no logic variables,
unification, or constraint solvers. The talk -- and this page --
describe several classical examples of logic programming, all
implemented in OCaml, with Hansei library. You can do logic
programming in your language. All you need is
fork(2) -- a form of
clonable threads with a backtrackable (world-local) state.
Tim Menzies, David Owen and Julian Richardson: ``The Strangest Thing About Software''
IEEE Computer, January 2007, pp. 54-60.
< http://menzies.us/pdf/07strange.pdf >
Abstract: ``Although there are times when random search is dangerous and should be avoided, software analysis should start with random methods because they are so cheap, moving to the more complex methods only when random methods fail.''
The article makes the compelling case for guessing as a way of solving problems. It details two examples, scheduling and finite-model checking, where random search turns out faster than sophisticated methods. These examples are not accidental, the authors argue.
The code for the puzzle at the beginning of the talk
`Reversible' parser combinators
The second example in the talk
The code for the reversible type inference, the last example in the talk
Logic programming without logic variables should be obvious. The minimal model semantics of Prolog needs no logic variables; after all, Herbrand basis is by definition a set of ground atoms. The fixpoint construction (iterating the immediate consequence operator) gives the algorithm of finding the minimal model with no use for variables or unification. The reality dims the hope: although the fixpoint construction is viable for definite Datalog programs, it is difficult to perform when the Herbrand basis is infinite and the logic program contains negation and committed choice.
Two recent theoretical results in functional logic programming, Antoy and Hanus (2006) and Dios Castro and Lopez-Fraguas (2006), point out that logic variables and non-determinism due to overlapping rules have the same computational power. The main idea is representing a logic variable by a generator of its domain, with sharing of the generator expressing repeated occurrences of the variable. The idea is intuitively clear: a logic variable indeed stands for some term, non-deterministically chosen from a suitable domain. The papers derive a systematic transformation of a program with logic variables into the one without, while preserving the set of derivations. Alas, preserving the set of derivations (reductions) is a rather weak notion of program equivalence. For example, it does not account for the multiplicity of results, considering a program that yields, say, a single 0 equivalent to a program yielding the infinite sequence of 0s. In logic programming practice, we usually do distinguish such programs, valuing programs that produce a finite set of answers in finite time. The other drawback of the theoretical frameworks used in both papers is the inability to reason about finite failure. A logic program that returns "NO" and a program that loops forever are regarded as equivalent in these frameworks. In practice we consider such programs as very different. Since the papers are written as theoretical, the details about sharing and fairness of generators are sketchy.
Aware of the theoretical hopes and practical shortcomings, we decided to press ahead and see just how far we can go expressing standard Prolog idioms in a language without logic variables. Pending adequate theoretical analysis, empirical case studies seem the only choice. The present web page documents our results. The somewhat unexpected conclusion is that we go quite far. Hansei as it is appears suitable for logic programming.
We have gone back to Herbrand: we build the Herbrand universe (the set of
all ground terms) and explore it to find a model of a program
(that is, tuples of terms that model all declared relations).
We build the universe by modeling `logic variables'
as generators for their domains. It greatly helps that Hansei is
typed; the type of a `logic variable' specifies its domain and, hence,
the generator. Since the Herbrand universe for most logic programs
is infinite, we need
non-strict evaluation. Furthermore, since logic variables may
occur several times, we must be able to correlate the generators. Although
Hansei is embedded in OCaml, which is strict, Hansei supports
on-demand evaluated non-deterministic computations with sharing:
letlazy computations. Finally, since the search space
is generally infinite, we need a systematic way of exploring it,
without getting stuck in one infinite sub-region. Again Hansei provides
a number of search strategies, including sampling and iterative deepening.
(We shall use the latter in all our examples.)
Logic variables and unification have been introduced by Robinson as a way to `lift' ground resolution proofs of Herbrand, to avoid generating the vast number of ground terms. Logic variables effectively delay the generation of ground terms to the last possible moment and to the least extent. Doing computations only as far as needed is also the goal of lazy evaluation. It appears that lazy evaluation can make up for logic variables, rendering Herbrand's original approach practical.
John Alan Robinson: ``Computational Logic: Memories of the Past and Challenges for the Future''
Proceedings of the First International Conference on Computational Logic, LNAI 1861, pp. 1-24, 2000.
John Alan Robinson reminisces about Herbrand, unification and the discovery of resolution.
Sergio Antoy and Michael Hanus: Overlapping rules and logic variables in Functional Logic Programs
J. de Dios Castro and F.J. Lopez-Fraguas: Elimination of Extra Variables in Functional Logic Programs
Proc. V Jornadas sobre Programacion y Lenguajes (PROLE'06), pp. 121-135, 2006.
appendrelation. Append relates three lists
l3is the concatenation of
l2. All throughout we take our lists to be lists of booleans. We will contrast Prolog code with Hansei code, developing both incrementally and interactively. The responses of Prolog resp. OCaml top-level are indented.
In Prolog, the append relation is stated as:
append(,L,L). append([H|T],L,[H|R]) :- append(T,L,R).declaring that an empty list is a prefix of any list, a list is a suffix of itself, and prepending to the prefix of a list prepends to the list. Certainly
appendcan concatenate two lists:
?- append([t,t,t],[f,f],X). X = [t, t, t, f, f].By passing a free logic variable as one of the two first arguments, we can concatenate not fully-determined lists:
?- append([t,t,t],X,R). R = [t, t, t|X].We see the single result, which stands for every list with the prefix
[t,t,t]. Such a compact representation of an infinite set is an asset. Alas, it is not always available, and is often a mirage. For example, if we pass a free variable as the first argument of
append, Prolog responds differently:
?- append(X,[f,f],R). R = [f, f] ; R = [_G328, f, f] ; R = [_G328, _G334, f, f] ; R = [_G328, _G334, _G340, f, f] .with an infinite set of answers, to be enumerated indefinitely. It may not be fully apparent that our Prolog code has not faithfully represented the original problem: to relate boolean lists. The last two answers describe lists of more than mere booleans. We have to impose the restriction, by defining the type predicates
bool(t). bool(f). boollist(). boollist([H|T]) :- bool(H), boollist(T).and re-writing our queries:
?- append([t,t,t],X,R), boollist(X), boollist(R). R = [t, t, t] ; R = [t, t, t, t] ; R = [t, t, t, t, t] ; R = [t, t, t, t, t, t] ; R = [t, t, t, t, t, t, t, t, t] . ?- append(X,[f,f],R), boollist(X), boollist(R). R = [f, f] ; R = [t, f, f] ; R = [f, f, f] .One of
boollist(R)would have been enough: if
Xis a boolean list, so is
R. Alas, Prolog has no type inference and is unable to infer or take advantage of that fact. To be safe, we add both predicates. The compact representation for the lists with the
[t, t, t]prefix has disappeared. More seriously, the default depth-first search strategy of Prolog gives us only half of the answers; we won't ever see the lists with an
fafter the first three
We switch to Hansei. In a typed language, types -- the specification
of the problem -- come first. We start by defining the type
boolean lists with a non-deterministic spine, along with the convenient
type bl = Nil | Cons of bool * blist and blist = unit -> bl;; let nil : blist = fun () -> Nil;; let cons : bool -> blist -> blist = fun h t () -> Cons (h,t);;and the conversion to ordinary, fully deterministic lists, to see the results.
let rec list_of_blist bl = match bl () with | Nil ->  | Cons (h,t) -> h:: list_of_blist t;; val list_of_blist : blist -> bool list = <fun>
We now define
append, seemingly as a function, in the fully standard,
also declarative, way.
let rec append l1 l2 = match l1 () with | Nil -> l2 | Cons (h,t) -> cons h (fun () -> append t l2 ());; val append : blist -> blist -> blist = <fun>
An attempt to concatenate two sample lists:
let t3 = cons true (cons true (cons true nil));; let f2 = cons false (cons false nil);; append t3 f2;; - : blist = <fun>
turns out not informative. Recall that we use the Hansei library to build a probabilistic model, which we then have to run. Running the model determines the set of possible worlds consistent with the probabilistic model: the model of the model. The set of model outputs in these worlds is the set of answers. Hansei offers a number of ways to run models and obtain the answers and their weights. We will be using iterative deepening:
val reify_part : int option -> (unit -> 'a) -> (Ptypes.prob * 'a) listThe first argument is the depth search bound (infinite, if
reify_part None (fun () -> list_of_blist (append t3 f2));; - : (Ptypes.prob * bool list) list = [(1., [true; true; true; false; false])]Running our first append model has given the expected result, with the expected weight. Following along the Prolog code, we turn to appending to an indeterminate list.
Indeterminate lists are represented by generators. Hence, we need
generators for bool and bool list domains. The generators must use
to delay the generation until the value is needed; once the value has been
determined, it stays the same.
let a_boolean () = letlazy (fun () -> flip 0.5);; val a_boolean : unit -> unit -> bool = <fun> let rec a_blist () = letlazy (fun () -> dist [(0.5, Nil); (0.5, Cons(flip 0.5, a_blist ()))]);; val a_blist : unit -> blist = <fun>
Let us see a sample of generated values, as a spot-check:
reify_part (Some 5) (fun() -> list_of_blist (a_blist ()));; - : (Ptypes.prob * bool list) list = [(0.5, ); (0.125, [false]); (0.03125, [false; false]); (0.03125, [false; true]); (0.125, [true]); (0.03125, [true; false]); (0.03125, [true; true])]The lists of different lengths have different probabilities, or different likelihoods of encountering them. We could use the accumulated probability mass as the measure of the explored search space, terminating the search once a threshold is reached. One is reminded of probabilistic algorithms such as the Rabin-Miller primality test.
We are now ready to reproduce the Prolog code, passing a `free' variable
as an argument to
append and obtaining a sequence of lists with the
given prefix or suffix. Unlike Prolog, we are
not stuck in a corner of the search space. We generate all lists
with the given prefix
[true; true; true].
reify_part (Some 5) (fun() -> let x = a_blist () in list_of_blist (append t3 x));; - : (Ptypes.prob * bool list) list = [(0.5, [true; true; true]); (0.125, [true; true; true; false]); (0.03125, [true; true; true; false; false]); (0.03125, [true; true; true; false; true]); (0.125, [true; true; true; true]); (0.03125, [true; true; true; true; false]); (0.03125, [true; true; true; true; true])] reify_part (Some 5) (fun() -> let x = a_blist () in list_of_blist (append x f2));; - : (Ptypes.prob * bool list) list = [(0.5, [false; false]); (0.125, [false; false; false]); (0.03125, [false; false; false; false]); (0.03125, [false; true; false; false]); (0.125, [true; false; false]); (0.03125, [true; false; false; false]); (0.03125, [true; true; false; false])]
append is so elegant because it defines a relation among
three lists. One can specify any two lists and query for
the other one that makes the relation hold. For example, we check if
a given list has a given prefix, and if so, remove it:
?- append([t,t],X,[t,t,t,f,f]). X = [t, f, f].
We can do the same in Hansei, effectively `running
We introduce a sample list to deconstruct and a function two compare
two lists. It is an ordinary function, not a built-in.
let t3f2 = append t3 f2;; let rec bl_compare l1 l2 = match (l1 (), l2 ()) with | (Nil,Nil) -> true | (Cons (h1,t1), Cons (h2,t2)) -> h1 = h2 && bl_compare t1 t2 | _ -> false;; val bl_compare : blist -> blist -> bool = <fun>The prefix-removal example is as follows. In words: we assert that there exists a boolean list
xsuch that appending it to
[true; true]yields the list
t3f2. We ask Hansei to find all possible worlds in which the assertion holds and report the value of
xin these worlds.
reify_part None (fun() -> let x = a_blist () in let r = append (cons true (cons true nil)) x in if not (bl_compare r t3f2) then fail (); list_of_blist x);; - : (Ptypes.prob * bool list) list = [(0.0078125, [true; false; false])]Since the search bound was
None, the returned single result came from the exhaustive enumeration of the entire search space. The search space is indeed infinite, but amenable to efficient exploration because most of the possible worlds are inconsistent with the assertion and could be rejected wholesale, without generating them. For example, when comparing an indeterminate list to an empty list,
bl_comparereturns the result after forcing a single choice. One can tell if a list is empty by checking for the first
Conscell. One does not need to know how many else elements there may be in the list. In general, if the lists are not equal, it can be determined by examining their finite prefix. Furthermore, if one of
bl_compare's arguments has a fixed length (as is the case in our example:
t3f2is the list of 5 elements), the comparison finishes after forcing finitely many choices. The ability of
bl_compareto deal with partially determined lists relates it to unification: unifying
[X|Y]with the empty list fails without the need to instantiate
Y, in effect performing the comparison with the infinite set of concrete lists.
We can likewise use
append to check for, and remove, a given suffix.
We can also split a given list in all possible ways,
returning its prefixes and suffixes. If the list is finite, we obtain
the finite number of answers, in Prolog:
?- append(X,Y,[t,t,t,f,f]). X = , Y = [t, t, t, f, f] ; X = [t], Y = [t, t, f, f] ; X = [t, t], Y = [t, f, f] ; X = [t, t, t], Y = [f, f] ; X = [t, t, t, f], Y = [f] ; X = [t, t, t, f, f], Y =  ; false.and, similarly, in Hansei:
reify_part None (fun() -> let x = a_blist () in let y = a_blist () in let r = append x y in if not (bl_compare r t3f2) then fail (); (list_of_blist x, list_of_blist y));; - : (Ptypes.prob * (bool list * bool list)) list = [(0.000244140625, (, [true; true; true; false; false])); (0.000244140625, ([true], [true; true; false; false])); (0.000244140625, ([true; true], [true; false; false])); (0.000244140625, ([true; true; true], [false; false])); (0.000244140625, ([true; true; true; false], [false])); (0.000244140625, ([true; true; true; false; false], ))]The Hansei code follows the demonstrated pattern, of building a probabilistic model returning a pair of two `random' lists
y, conditioned upon the asserted evidence that their concatenation is
t3f2. In the above code, the OCaml variable
xis used twice. Although it is bound to a generator, its two use occurrences refer to the same generated value -- thanks to
letlazyin the definition of
Our final example is
last, relating a list to its last element.
We contrast Prolog code
last(E,L) :- append(_,[E],L). ?- last(E,[t,t,f]). E = f ;
to Hansei code:
let last bl = let x = a_blist () in let e = a_boolean () in if not (bl_compare (append x (cons (e ()) nil)) bl) then fail (); e;; val last : blist -> unit -> bool = <fun> reify_part None (fun() -> last t3f2 ());; - : (Ptypes.prob * bool) list = [(0.0009765625, false)]
There are still open questions. The following code produces the result with no restrictions on the search space:
reify_part None (fun () -> let x = a_blist () in let y = a_blist () in if not (bl_compare x t3f2) then fail (); if not (bl_compare x y) then fail (); list_of_blist y);; - : (Ptypes.prob * bool list) list = [(2.384185791015625e-07, [true; true; true; false; false])]
A slightly modified code, with
(bl_compare x y) first, although
gives the same result, requires the restriction on the search
depth. The search space can no longer be efficiently enumerated.
Can we automatically reorder the comparison statements, to improve the
search? Another open problem is the `occurs check.' Its absence in Hansei
again necessitates the restriction on the search space.
None of the two open issues are fatal: restricting the search by
depth or by the remaining probability mass (effectively, by the
probability of missing a solution if one exists) is always possible
and even natural. It is nevertheless interesting to see if
the occurs check and the unification of two unbound variables
have a clear analogue in the variable-less case.
For the daily schedule for Monday to Wednesday:We take the phrase `One of the days I'll shop.' to mean that `I shop only on one day out of the three.' Otherwise, there are too many solutions: for example, I can do all the activities on each day. We develop our examples interactively, by submitting expressions to the top-level of the OCaml interpreter after loading the Hansei library. In the transcript below, the OCaml top-level responses are indented.
Which are all possible daily schedules, regarding those four events?
- One of the days I'll shop;
- One of the days I'll take a walk;
- One of the days I'll go to the barber;
- One of the days I'll go to the supermarket;
- The same day as I go to the supermarket, I'll shop;
- The same day as I take a walk I'll go to the barber;
- I'll go to the supermarket the day before the day I'll take a walk;
- I'll take a walk Tuesday.
The gist of the problem is selecting a subset of actions to do on one day from the set of possible actions. In other words, we need to sample from a powerset. Such a sampling function could have been provided by Hansei. It is however easy and illustrative to define from scratch:
open ProbM;; let powerset lst = List.fold_left (fun acc e -> if flip 0.5 then e::acc else acc)  lst;; val powerset : 'a list -> 'a list = <fun>In words: to choose a subset of the set
lstwe decide, randomly and independently, for each element of
lstwhether to take it or not. We use OCaml lists to represent sets. To spot-check our sampling procedure we compute the probability table for all samples from
exact_reify (fun () -> powerset [1;2;3]);; - : int list Ptypes.pV = [(0.125, Ptypes.V [3; 2; 1]); (0.125, Ptypes.V [3; 2]); (0.125, Ptypes.V [3; 1]); (0.125, Ptypes.V ); (0.125, Ptypes.V [2; 1]); (0.125, Ptypes.V ); (0.125, Ptypes.V ); (0.125, Ptypes.V )]Indeed all subsets are present, with the equal probabilities.
We define another simple but helpful function, which asserts the
x in the current world:
let mustbe x = if not x then fail ();; val mustbe : bool -> unit = <fun>
We are ready to encode our problem. We define the datatype enumerating the possible actions:
type action = Shop | Walk | Barber | Supermarket;;To define the model we specify the sampling procedure for the actions and enumerate the constraints, following the most straightforward generate-and-test approach.
let schedule_model () = let ndays = 3 inWe determine the set of actions to do on day
d. The set is chosen randomly from the set of all possible actions. However, if we ask again which actions were chosen for day
d, we should get the same answer. Thus
actions dacts as a logic variable: its value, initially indeterminate, once determined, does not change. We use the
memofacility of Hansei to define such memoized stochastic functions.
let actions = memo (fun d -> powerset [Shop; Walk; Barber; Supermarket]) inThe proposition
action_on a dstates that on day
dI do action
a. The proposition
only_on a dasserts that I do action
aonly on day
d(and not on any other day):
let action_on a d = List.mem a (actions d) in let only_on a d = [d] = List.filter (action_on a) [0;1;2] inWe now state the constraints. For example, the first two lines below assert that there exists a day on which I shop. I do not shop on any other day.
let d = uniform ndays in let _ = mustbe (only_on Shop d) in let d = uniform ndays in let _ = mustbe (only_on Walk d) in let d = uniform ndays in let _ = mustbe (only_on Barber d) in let d = uniform ndays in let _ = mustbe (only_on Supermarket d) in (* The same day as I go to the supermarket, I'll shop. That is, there is a day that I perform the action Shop and the action Supermarket *) let d = uniform ndays in let _ = mustbe (action_on Supermarket d && action_on Shop d) in let d = uniform ndays in let _ = mustbe (action_on Walk d && action_on Barber d) in (* I'll go to the supermarket the day before the day I'll take a walk. *) let d = uniform ndays in let _ = mustbe (action_on Walk d && d > 0 && action_on Supermarket (d-1)) in (* I'll take a walk Tuesday. *) let _ = mustbe (action_on Walk 1) inFinally, we return the schedule: the array of actions for each day. This is the output of the model.
Array.init ndays actions;; val schedule_model : unit -> action list array = <fun>The model is defined. We now `solve' it, determining if there exists a schedule satisfying all constraints. Hansei's reply is nearly instantaneous.
exact_reify schedule_model;; - : action list array Ptypes.pV = [(1.11632658893461385e-07, Ptypes.V [|[Supermarket; Shop]; [Barber; Walk]; | ])]There is only one schedule satisfying the constraints: on Monday I go to the supermarket and shop, on Tuesday I walk and take a haircut. Hansei has also computed the `probability' of the solution, which is the the estimate of the search space. Each choice in the model corresponds to a possible world; many possible worlds are not consistent with the stated constraints and facts and are rejected. The reported
1.11632658893461385e-07is the weight of the surviving possible world(s), the one(s) that satisfied all constraints. Since all our choices were uniform, the weight represents one in
1/1.11632658893461385e-07= 8957952 that many worlds. Since worlds can be rejected wholesale, Hansei did not have to generate all of them. If one fact in a world contradicts the evidence, we do not need to know the other facts; that is, we do not need to make the choices for them.
all_differentis a hassle to state and a chore to evaluate; constraint logic programming (sub)systems (e.g., SICStus) typically rely on external constraint solvers, where all-different is built-in. Even without `all-different' the puzzle is hard; the naive generate-and-test approach is too slow; a bit of creativity is required.
Hansei's formulation of the problem is straightforward and declarative: the naive generate-and-test. That was the point, to see how well Hansei copes without the benefit of a creative formulation or fine-tuning. Hansei copes quite well. It helped that deterministic parts of the puzzle more naturally express in a functional language with data structures and pattern-matching (Hansei's host language, OCaml) than in Prolog. We use non-determinism only when we need it. It also greatly helped that the all-different constraint, or sampling all permutations, turns out efficiently, lazily, implementable in Hansei.
The complete, commented source code, with every significant line annotated.
Hansei has determined that the puzzle has only one solution, whose weight is
5e-12. The search space is large indeed.
Lazy permutations and all-different constraint
The Hansei parser combinator library looks quite like the famous Parsec, by which it is inspired. To be sure, the sort of non-determinism needed for Parsec is easily emulated in many languages -- that's why Parsec has spread so widely. And yet Hansei brings something new: running Parsec parsers not only forwards, parsing a given stream, but also effectively backwards, to generate a stream.
As usual, Hansei parsers digest a stream of characters; the characters
do not need to be present all in memory, but can be read on demand.
stream is a thunk. A parser takes a stream and
returns the parsing result, the result of a semantic action, and
the remainder of the stream:
type stream_v = Eof | Cons of char * stream and stream = unit -> stream_v type 'a parser = stream -> 'a * stream
The two primitive parsers are
let pure : 'a -> 'a parser = fun x st -> (x,st)which parses the empty string returning its argument as the parsing result, and
let p_sat : (char -> bool) -> char parser = fun pred st -> match st () with | Cons (c,st) when pred c -> (c,st) | _ -> fail ()that checks that the current element of the stream satisfies a given predicate -- and if it does not, reports failure using Hansei's
fail. Other parsers are written in terms of the above, for example, the parser of a character
let p_char : char -> char parser = fun c -> p_sat (fun x -> x = c)
Combinators combine parsers and their semantic actions and express the rules of the grammar:
val (<*>) : ('a -> 'b) parser -> 'a parser -> 'b parser val (<$>) : ('a -> 'b) -> 'a parser -> 'b parser val ( *> ) : 'a parser -> 'b parser -> 'b parser val ( <* ) : 'a parser -> 'b parser -> 'a parser val p_fix : ('a parser -> 'a parser) -> 'a parser val many : 'a parser -> 'a list parser
(<*>) combines parsers sequentially. The parsers
( *> ),
( <* ) are the variations with respect to semantic
actions; their types show what they do. Alternative rules in a grammar
are expressed with the alternation combinators
let (<|>) : 'a parser -> 'a parser -> 'a parser = fun p1 p2 st -> uniformly [|p1;p2| ] st let alt : 'a parser array -> 'a parser = fun pa st -> uniformly pa stwhich rely on the Hansei operator for non-deterministic choice, of a parser from a given set.
Here is the first example: recognizing palindromes. Since we build a recognizer, the semantic actions do nothing, returning unit. The grammar reads quite like BNF: A palindrome over the two-character alphabet is either the empty string, a single character, or a palindrome flanked on both sides with the same character:
let pali = p_fix (fun pali -> alt [| empty; (fun _ -> ()) <$> p_char 'a'; (fun _ -> ()) <$> p_char 'b'; p_char 'a' *> pali <* p_char 'a'; p_char 'b' *> pali <* p_char 'b' | ] )
To parse a string according to the grammar -- to `run the parser forwards' -- we convert a string to a stream, apply the parser and check that the stream is fully consumed:
let run_fwd : 'a parser -> string -> 'a Ptypes.pV = fun p s -> exact_reify (fun () -> let (v,s) = p (stream_of_string s) in if s () <> Eof then fail (); v)We could just as well parse a file rather than an in-memory string. Evaluating
run_fwd pali "abaaba"shows that "abaaba" is a palindrome.
More interestingly, the very same parser can be run effectively backwards, telling us all the strings it parses.
let run_bwd : int option -> 'a parser -> (unit -> stream) -> ('a * string) Ptypes.pV = fun n p stm -> Inference.explore n (reify0 (fun () -> let st = stm () in let (v,st') = p st in if st' () <> Eof then fail (); (v,string_of_stream st)))Since the set of such strings can be infinite, we have to bound the search, specifying the inference depth limit as the first argument of
Nonemeans no limit). For example, to see a few palindromes, we evaluate:
run_bwd (Some 10) pali (fun () -> stream_over [|'a';'b'| ])To run a parser effectively backwards, all we need is to give the parser a random stream to parse. The function
stream_overgenerates all streams over a given alphabet -- the infinite number of them. The parser filters the set of streams; the surviving streams are then reported as parseable.
The next example finds all 5-letter palindromes. We add
stream_len st 5, which fails if the stream
does not have 5 elements:
run_bwd None pali (fun () -> let st = stream_over [|'a';'b'| ] in stream_len st 5; st)Although the code indeed generates all eight 5-letter palindromes (equally likely), it is quite puzzling. No search limit was imposed, and yet the search terminated, even though
stream_overproduces the infinite sequence of streams. Since
stream_lenrejects all streams whose length is not 5, we should get stuck generating longer and longer streams, and rejecting them all. But we don't.
Let's see the code for
stream_len and recall the type of the stream:
type stream_v = Eof | Cons of char * stream and stream = unit -> stream_v let rec stream_len st n = match (st (),n) with | (Eof,0) -> () | (Cons (_,t),n) when n > 0 -> stream_len t (n-1) | _ -> fail ()
All characters of the stream indeed do not have to be in memory at
once: they can be read on-demand, when the thunk, representing the
stream, is forced. In
stream_len code, this demand is expressed
st (). Forcing the stream brings in a new character, and the
thunk for the tail of the stream. Clearly,
stream_len st 5
forces no more than 6 thunks. If the 6th thunk yields a
function fails and never examines the stream any further. The rest of the
stream is never demanded, and is never chosen.
There remains one more mystery of
run_bwd: in its code, shown
earlier, the identifier
st is used three times, as if it were the
same sequence of characters. However,
st is a thunk, choosing the
sequence of characters on-demand. Forcing the thunk
st () in one
place (e.g., inside
stream_len) does not guarantee that the
st () somewhere else would give exactly the same
choice. Sharing a procedure for making choices does not mean that the
choices themselves are shared. For that reason, the stream generator
stream_over employs a trick:
val letlazy : (unit -> 'a) -> (unit -> 'a) let stream_over : char array -> stream = fun ca -> let rec loop () = if flip 0.5 then Cons (uniformly ca, letlazy loop) else Eof in letlazy loopThe function
letlazy, which looks from its type as an identity function, takes a thunk and returns a thunk. When the result of
letlazy thunkis repeatedly forced, it repeatedly makes the same choice. Such non-deterministic laziness requires first-class, or world-local memory -- which is what Hansei implements. In functional-logic programming, this is called ``call-time choice''. In quantum mechanics, in is called ``wavefunction collapse''. Before we observe a system, for example, before we look at a flipped coin, there could indeed be several choices for the result. After we looked at the coin, all further looks give the same value.
The source code gives another example of reversible parsing: parsing an arithmetic expression and computing its value. The grammar looks like the one shown in nearly every book on compiler construction:
let digit = (fun c -> Char.code c - Char.code '0') <$> p_sat (fun x -> x >= '0' && x <= '9') let num = (List.fold_left (fun acc d -> d + acc * 10) 0) <$> many1 digit let rec expression st = ( ((fun t1 op t2 -> op t1 t2) <$> term <*> plus_minus <*> expression) <|> term ) st and term st = ( ((fun t1 op t2 -> op t1 t2) <$> factor <*> prod_div <*> term) <|> factor ) st and factor st = ( (p_char '(' *> expression <* p_char ')') <|> num ) stThe code demonstrates both the parsing of a string and obtaining all valid arithmetic expressions, or expressions that evaluate to 5.
The parsing combinator library demonstrated that reversible parsers are just as possible in Hansei as they are in Prolog. Prolog is a separate language whereas Hansei is just an OCaml library. The parsing combinator library shows off the laziness principle: delaying the choices until the last possible moment, hoping that the moment will never arrive. We have seen how the principle has made the infinite search finite. Less haste, infinitely more speed.
Guess Lazily! Making a program guess, and guess well
The parsing combinator library was the main example in the talk and was discussed in detail.
Maximal munch, committed choice and nested inference
What is maximal munch and why it is so common that is hardly ever mentioned? A programming language specification cliche defines the syntax of an identifier as a letter followed by a sequence of letters and digits, or, in the extended BNF,
identifier ::= letter letter_or_digit*where
*, the Kleene star, denotes zero or more repetitions of
letter_or_digit. Thus in a program fragment
var1 + var2,
var2are identifiers. Why not
var? According to the above grammar every prefix of an identifier is also an identifier. To avoid such conclusions and the need to complicate the grammar, the maximal munch rule is assumed:
letter_or_digit*denotes the longest sequence of letters and digits. Without the maximal munch, we would have to write
identifier ::= letter letter_or_digit* [look-ahead: not letter_or_digit]which is not only awkward and requires the notation for look-ahead, but is also much less efficient. If
*means mere zero or more occurrences,
"var1 "will match the empty string,
"ar1". Only the last match leads to the successful parse of the identifier, recognizing
var1. Maximal munch cuts the irrelevant choices. It has proved so useful that it is rarely explicitly stated when describing grammars.
Maximal munch however all but destroys the reversible parsing, the
ability to run the parser forward (as a parser or recognizer) and
backward (as a language generator). We illustrate the problem in
Prolog. A recognizer in Prolog is a relation between two streams
(lists of characters)
Srem such that
Srem is the suffix of
In a functional language, we would say that a recognizer recognizers the
S, returning the remaining stream as
Here is the recognizer for the character
charA([a|Srem],Srem).The Kleene-star combinator (typically called
many) takes as an argument a recognizer and repeats it zero or more times. Without the maximal munch, it looks as follows:
many0(P,S,S). many0(P,S,Rest) :- call(P,S,Srem), many0(P,Srem,Rest).Thus
many0(charA,S,R)will recognize or generate the prefix of
Swith zero or more
'a'characters. Thanks to the first clause,
many0(P,S,R)always recognizes the empty string. Here is how we recognize
a*in the sample input stream
?- many0(charA,[a,a,b],R). R = [a, a, b] ; R = [a, b] ; R = [b]and generate the language of
?- many0(charA,S,). S =  ; S = [a] ; S = [a, a] ; S = [a, a, a] ; S = [a, a, a, a] ; ...
To implement the maximal munch,
many should call the argument parser
for as long as it succeeds. We need a way to tell if the parser
fails or succeeds. To branch on success or failure of a goal, Prolog
offers soft-cut. Recall, soft-cut
P *-> Q; R is equivalent to the conjunction
P, Q if
P succeeds at least once. Soft-cut commits to that choice
and totally discards
R in that case.
R is evaluated only when
fails from the outset. Soft-cut lets us write
many with maximal munch:
many(P,S,Rest) :- call(P,S,Srem) *-> many(P,Srem,Rest) ; S = Rest.Now the the empty string is recognized (i.e.,
S = Rest) only if the parser
a*in the sample input
?- many(charA,[a,a,b],R). R = [b].
becomes quite more efficient. There is only one choice, for
the longest sequence of
as. However, attempting to generate
?- many(charA,S,). <loops>leads to an infinite loop. The argument recognizer,
charA, when asked to generate, always succeeds. Therefore, the recursion in
manynever terminates. When running backwards, the recognizer tries to generate, disregarding all other choices, the longest string of
as -- the infinite string. Although the empty string is part of the language
a*, we fail to generate it.
The appearance of the soft-cut in the definition of
have already raised the alarm. After all, soft-cut embodies
negation, which is required to detect
failure and proceed. In fact, negation can be expressed via
not(P) :- P *-> fail; true.
The Hansei parser combinator library supports
many, which, unlike
the one in Prolog, no longer forces the trade-off between efficient
parsing and generation. Hansei's
many obeys maximal munch and
generates the complete language, with no omissions. Hansei lets us
have it both ways. Before describing the implementation, we show a few
many1 p = p <* many p) may nest. In Example 3,
a*adoes not recognize
a*munches the entire stream leaving nothing for the parser of the final
a. This is the expected behavior under maximal munch. Example 7 shows no parse for the same reason. Incidentally, the example demonstrates that under maximal munch,
NOT (many (p1 <|> p2) >= many p1 <|> many p2)where
>=is to be understood as language inclusion. This inequality does hold for
manywithout maximal munch (think of
(a+b)^n >= a^n + b^nfor non-negative
n). Finally in the last example we generate the complete language for
a*, including the empty string. That is,
many phas the property
many p (s1 | s2) = many p s1 | many p s2where
|means non-deterministic choice.
To implement the maximal munch in Hansei we need something like soft-cut,
the ability to detect a failure and proceed. Hansei has exactly
the right tools:
type 'a vc = V of 'a (* leaf *) | C of (unit -> 'a pV) (* unexpanded branch *) and 'a pV = (prob * 'a vc) list val reify0 : (unit -> 'a) -> 'a pV val reflect : 'a pV -> 'a
reify0 converts a probabilistic computation
to a lazy tree of choices
'a pV, whose nodes contain found solutions
V of 'a or not-yet-explored branches. The primitive
reflect, the inverse
reify0, turns a tree of choices into a probabilistic program that
will make those choices. The primitive
reify0 is fundamental in Hansei:
probabilistic inference is implemented by first reifying a program
(the generative model) to the tree of choices and then exploring the
tree in various ways.
For instance, the full tree traversal corresponds to exact inference.
Soft-cut can also be implemented as a choice-tree traversal,
below, which explores the branches looking for the first
It returns the tree resulting from the exploration, which could be
empty if no
V leaf was ever found. Soft-cut then is simply
val first_success : 'a pV -> 'a pV let soft_cut : (unit -> 'a) -> ('a -> 'w) -> (unit -> 'w) -> 'w = fun p q r -> match first_success (reify0 p) with |  -> r () | t -> q (reflect t)We write
manyin terms of the soft-cut as we did in Prolog:
let many : 'a parser -> 'a list parser = fun p -> let rec self st = soft_cut (fun () -> p st) (* check parser's success *) (fun (v,st) -> (* continue parsing with p *) let (vs,st) = self st in (v::vs,st)) (fun () -> (,st)) (* total failure of the parser *) in self
The second question is avoiding losing solutions when running the parser ``backwards''. Unlike Prolog, Hansei parsers are functions rather than relations. They take a stream, attempt to recognize its prefix and return the rest of the stream on success. They cannot be run backwards. However, we achieve the same result -- generating the set of parseable streams -- by generating all streams, feeding them to the parser and returning the streams that parsed completely. Since the number of possible streams is generally infinite, we have to generate them lazily, on demand. To ensure completeness -- to avoid losing any solutions -- the parsers should have the property
many p (s1 | s2) = many p s1 | many p s2Surprisingly,
many palready satisfies it. The trick is laziness in the stream and Hansei's support of nested inference. The primitive
reify0may appear in probabilistic programs -- in other words, a probabilistic model may itself perform inference, over an inner model. In order for this to work correctly, we had to ensure that
let x = letlazy (s1 | s2) in reify0(fun () -> model x) === let x = letlazy s1 in reify0(fun () -> model x) | let x = letlazy s2 in reify0(fun () -> model x)where
xis demanded in model. That is,
reify0should reify only the choices made by the inner program, and let the outer choices take effect. The stream generator has to be lazy, so it has the form
letlazy (s1 | s2). Comparing the nested inference property with the code for
manyreveals that the key property of
many p (s1 | s2)is satisfied without us needing to do anything. Our
manyhas exactly the right semantics.
In conclusion, standard logic programming all too often forces us to choose between efficiency and expressiveness, on one hand, and completeness on the other hand. Negation and committed choice make logic programs easier to write and, in some modes, faster to run. Alas, some other modes (informally, running `backwards') become unusable or impossible. Kleene star is a perfect example of the trade-off: maximal munch simplifies the grammar and makes parsing efficient, but destroys the ability to generate grammar's language. Functional logic programming systems can remove the trade-off. Properly implemented encapsulated search (nested inference, in Hansei) lets us distinguish the choices of the parser from the choices of the stream and cut only the former. Perhaps surprisingly this distinction just falls out of the need for non-deterministic stream to be lazy. Thus Kleene star with maximal munch lets us parse and generate the complete language. In Hansei, we can have it both ways.
many. In addition, the arithmetic expression grammar relies on
manyto parse integer literals.
`Reversible' parser combinators
LogicT - backtracking monad transformer with fair operations and pruning
Extensive discussion of committed choice
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML