Algorithms and Data Structures

Generic Zipper and its applications
LogicT - backtracking monad transformer with fair operations and pruning
The LogicT code includes solving the Missionaries and Cannibals problem, and playing Tic-Tac-Toe with minimax search and heuristics
lambda to SKI: the first compositional translation
Finally, the embedding of postfix languages one can really use
Incremental, Linear Pretty-printing
Secure Counting of members of a subset without revealing their identities
Normal-order evaluation of lambda-terms as a bottom-up parsing
Provably perfect random shuffling and its pure functional implementations
Iterated zipping puzzles
Annotating trees post factum
Breadth-first labeling
Eratosthenes and other number sieves
How to take a tail of a functional stream
Dynamic epistemic logic puzzles
Total stream processors and their applications to all infinite streams
Selecting a random node from a tree in one pass: from proof to code
Optimal justification of text with Dynamic Programming
Fast computation of Bernoulli numbers
Deriving fast functions to compute all subsets of size N
How to zip folds: A complete library of fold-represented lists
Representing knowledge about knowledge: ``Mr.S and Mr.P'' puzzle
Left fold vs. right fold: theory vs. practice
Generating efficient median filters by an applicative-order term rewriting system
Accumulating tree traversals, a better tree fold with applications to XML parsing
Grasping `all-the-apples-at-once'
Recursively enumerating binary arithmetic relations -- from addition to logarithm -- as conjunctions and disjunctions
Intersection testing of two segments of a circle or of two georectangles, with a correctness proof
Arithmetic compression
Towards the best collection traversal interface: from enumerator to cursor
On parent pointers in trees
Printing the outline of a pruned tree
Efficient integer logarithm of large numbers in any base
Powers of three
Cross-product of sequences
The Music of Streams on a strict instrument
Surprisingly simple fast typed-aligned queues
Sparse Grids: multi-dimensional interpolation, approximation and integration
Proving complexity of one particular implementation of priority queues
Treaps: An ordered dictionary data structure, based on randomized search trees
Pure functional, mutation-free, efficient double-linked lists
Pure-functional transformations of cyclic graphs: viral flattening, and the Credit Card Transform
Chess Tournament Scheduling
A neuron that learns how to play blackjack
Generating interesting lambda-terms

Representing knowledge about knowledge: ``Mr.S and Mr.P'' puzzle

We describe a concise Haskell solution to the ``Mr.S and Mr.P'' puzzle. We rely on the straightforward encoding of multiple-world semantics of modalities.

The problem was posed by John McCarthy as follows. We pick two numbers a and b, so that a>=b and both numbers are within the range [2,99]. We give Mr.P the product a*b and give Mr.S the sum a+b. The following dialog takes place:

Mr.P: I don't know the numbers
Mr.S: I knew you didn't know. I don't know either
Mr.P: Now I know the numbers
Mr.S: Now I know them too

Can we find the numbers a and b?

The following Haskell code demonstrates a generic method of encoding facts, and the knowledge about facts, and the knowledge of the knowledge, etc. Incidentally, compared to the notation in McCarthy's paper, the Haskell notation is notably concise.

Chung-chieh Shan commented: ``The basic idea is to think of a set of possible worlds. Corresponding to each person (whose knowledge is being modeled) is a partition of this set of possible worlds; each partition contains one or more worlds that this person cannot distinguish. For someone to know a fact is for all of that person's indistinguishable possible worlds to verify that fact. For Alice to know that Bob doesn't know the weather, is for all of Alice's possible worlds (relative to the real world) to reside within a Bob-partition in which the weather is not consistent across all worlds.''

Version

The current version is 1.3, Jun 23, 2006

References

John McCarthy: Formalization of two Puzzles Involving Knowledge. 1987
<http://www-formal.stanford.edu/jmc/puzzles.html>

Mr-S-P.lhs [4K]
Complete literate Haskell98 code
It was first mentioned in the message posted on Lambda-the-Ultimate on Jan 27, 2003. The present version adds a straightforward memoization.

Hans P. van Ditmarsch, Ji Ruan and Rineke Verbrugge: Sum and Product in Dynamic Epistemic Logic
Journal of Logic and Computation, 2008, v18, N4, pp.563--588.
The paper discusses at great extent the history of the puzzle, its modeling in modal `public announcement logic', and solving using epistemic model checkers.

Dynamic epistemic logic puzzles

Inspired by Hans van Ditmarsch's tutorial course on Dynamic Epistemic Logic at NASSLLI 2010, we present a simplistic model-theoretic framework to solve the puzzles like the following:

Anne, Bill and Cath each have a positive natural number written on their foreheads. They can only see the foreheads of others. One of the numbers is the sum of the other two. All the previous is common knowledge. The following truthful conversation takes place:

Anne: I don't know my number.

Bill: I don't know my number.

Cath: I don't know my number.

Anne: I now know my number, and it is 50.
What are the numbers of Bill and Cath?
The puzzle appeared as Problem 182 in the November 2004 issue of Math Horizons.

We encode the statement of the problem as a filter on possible worlds. The possible worlds consistent with the statement of the problem are the solutions. `Agent A does not know proposition phi' is interpreted as the statement that for all worlds consistent with the propositions that A currently knows, phi is true in some worlds but false in the others.

Version

The current version is 1.1, June 2010

References

DynEpistemology.hs [10K]
The complete commented Haskell98 code

H.P. van Ditmarsch, W. van der Hoek, and B.P. Kooi: Dynamic Epistemic Logic and Knowledge Puzzles
Proc. 15th International Conference on Conceptual Structures (ICCS). LNCS 4604, pp. 45-58, Springer, 2007.

Representing knowledge about knowledge: ``Mr.S and Mr.P'' puzzle

Selecting a random node from a tree in one pass: from proof to code

We present a procedure that picks a uniformly distributed random node from a tree. We traverse the tree only once and we do not know beforehand the number of nodes in the tree. The provably correct algorithm is an instance of a Reservoir sampling.
The procedure is written in the pure functional subset of R5RS Scheme and comes with the correctness proof. We must stress that the proof was developed not after the implementation but along with the implementation. In our experience, thinking about the proof and writing it down notably helped design and code the algorithm. Once the proof was written, the code followed immediately. The code worked on the first try.
Version: The current version is April 15, 2003
References: random-tree-node.scm [6K]
The complete source code, the proof of the algorithm, and the validation tests
The code was originally posted in the article Re: random node in tree on the newsgroup comp.lang.scheme on Tue, 15 Apr 2003 22:17:15 -0700

Optimal justification of text with Dynamic Programming

This code demonstrates Dynamic Programming on the problem of pretty printing a paragraph of text on a printer with fixed-width fonts. The goal is to tightly arrange a given sequence of n words within page margins, maximizing the overall neatness. To be more precise, we wish to minimize the sum, over all lines except the last, of the cubes of the number of blank characters at the end of each line. See the comments in the code for more details.
The algorithm has O(n^2) time and space complexities.
Version: The current version is 2.0, April 2001
References: word_layout.cc [9K]
Commented C++ source code and sample output, with many annotations.
Z. Galil, K. Park: A linear-time algorithm for concave one-dimensional dynamic programming
Information Processing Letters, v33, N6, 309-311, 1990.
David Eppstein, livejournal.com user 11011110, pointed out that the present problem is an instance of concave 1d dynamic programming, which admits a linear-time solution.

Deriving fast functions to compute all subsets of size N

The following two-part article attempts to design the fastest solution to the problem of finding all subsets of a given size from a given set. The precise problem is: given a set L and a number N, return the set of all subsets of L of cardinality N. Sets are represented by lists. We will be using R5RS Scheme.

In part 1, we start with the mathematical definition of the problem, which leads to a simple, correct, but inefficient solution. We then try to systematically optimize the function until we end up with the fastest function, which is notably faster than the other solutions proposed so far. The final solution is still pure functional. We also demonstrate that the choice of the Scheme interpreter does matter in relative performance of various algorithms.

In part 2, we again start with the mathematical definition of the problem, which leads to a simple, correct, and stunningly efficient solution. The final, so far the fastest solution is still pure functional. The key was to choose the right definition.

In the discussion, Doug Quale presented lazy stream implementations in Haskell and Scheme, and compared them with the above. Eli Barzilay described various memozied versions, which have even better performance. The two USENET threads contain the excellent discussion of the relative merits of memoization and laziness, contributed by Doug Quale and Eli Barzilay. The threads also include many timing comparisons.

References

subsets-size-n-part1.txt [13K]
Part 1 of the article
It was originally posted as Re: Subsets of a list on the newsgroup comp.lang.scheme on Sat, 12 Jan 2002 00:52:23 -0800

subsets-size-n-part2.txt [9K]
Part 2 of the article
It was originally posted as The FASTEST subsets function [Was: Subsets of a list] on the newsgroup comp.lang.scheme on Sat, 12 Jan 2002 00:56:01 -0800
The article is updated with a more optimized solution, which should perform better when compiled.

Discussion threads of the above titles, comp.lang.scheme, Jan 9-18, 2002.

Eratosthenes and other number sieves

Computing prime numbers is an important practical task as well as a common example for programming language tutorials. The Eratosthenes sieve is probably the most familiar algorithm for determining prime numbers. Alas, quite many implementations that call themselves Eratosthenes sieve do not actually implement that algorithm. For example, the classic Haskell code

    primes = sieve [ 2.. ] where
      sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]

is not the Eratosthenes sieve. For one thing, it uses the division operation (or, mod). The Eratosthenes sieve specifically avoids both division and multiplication, which were quite difficult in old times. Mainly, as Melissa O'Neill explains below, the code above tests every number for divisibility by all previously found primes. In contrast, the true Eratosthenes sieve affects only composite numbers. The prime numbers are `left out'; they are not checked by division or numeric comparison.

Given below are several implementations of the true Eratosthenes sieve algorithm, in Scheme, Scheme macros, and Haskell. The algorithm is usually formulated in terms of marks and crossing off marks, suggesting the imperative implementation with mutable arrays. The Scheme code follows that suggestion, using two important optimizations kindly described by George Kangas.

Eratosthenes sieve, however, can be implemented purely functionally, as Scheme macros and Haskell code demonstrate. The Haskell implementation is not meant to be efficient -- rather, it is meant to be purely functional, insightful, minimalist, and generalizable to other number sieves, e.g., `lucky numbers'. Like other Haskell algorithms it produces a stream of prime numbers. The Haskell implementation stores only marks signifying the numbers, but never the numbers themselves. Not only the implementation avoids multiplication, division or the remainder operations. We also avoid general addition and number comparison. We rely exclusively on the successor, predecessor and zero comparison. The predecessor can be easily eliminated. Thus the algorithm can be used with Church and Peano numerals, or members of Elliptic rings, where zero comparison and successor take constant time but other arithmetic operations are more involved.

Version

The current version is 1.5, Feb 12, 2007

References

Melissa O'Neill: Re: Genuine Eratosthenes sieve
Messages explaining the sieve algorithm and its differences from impostors; posted on the Haskell-Cafe mailing list, February 2007.

Eratosthenes-sieve.txt [6K]
Eratosthenes sieve and its optimal implementation
The explanation of the original Eratosthenes sieve and its optimizations.
The original article was posted as Re: arbitrary precision rationals on a newsgroup comp.lang.scheme on Tue, 13 Nov 2001 15:07:34 -0800

number-sieve.lhs [4K]
The literate Haskell98 source code for pure functional, minimalist Eratosthenes and lucky number sieves
The code was originally posted in the article Even better Eratosthenes sieve and lucky numbers on the Haskell-Cafe mailing list on Mon, 12 Feb 2007 18:37:46 -0800 (PST)

A stress test of the syntax-rule macro-expander
Eratosthenes sieve as a syntax-rule macro, to perform primality test of Church-Peano numerals at macro-expand time

Lucky numbers: another number sieve
<http://mathworld.wolfram.com/LuckyNumber.html>
<http://www.research.att.com/~njas/sequences/A000959>

Random Relative Primes

Cross-product of sequences

This article shows a solution (in Scheme) to a problem of computing a cross-product of n sequences. For example

    (cross-product '((0) (0 1 2 3) (1 2)))

evaluates to

    ((0 0 1) (0 0 2) (0 1 1) (0 1 2) (0 2 1) (0 2 2) (0 3 1) (0 3 2))

The solution is especially elegant if we use the standard (SRFI-1) append-map function.

Version

The current version is 1.1, January 11, 2001

References

cross-product.txt [2K]
Re: combinatorial stuff ?? The article with the complete code and transcripts, posted on the newsgroup comp.lang.scheme on Thu, 11 Jan 2001 19:59:45 GMT

Powers of three

A simple self-contained C code that computes and prints out 3^N very fast. A non-negative integer exponent N may be as big as 2000.

The code is the intended solution to a problem presented at the 1995 Programming Contest organized by University of North Texas' Chapter of ACM. The full text of the problem is given in the title comments to the code.

What counts is the overall speed, of computing the result and converting it to ASCII. And fast the code is: on HP 9000/770/J210, it completes 3^2000 under 0.09 seconds, whereas bc takes 0.3 seconds of user time. The present code uses no multiplications or divisions to compute and print 3^N.

Version

The current version is 1.1, March 1995

References

pow3.c [6K]
Commented C source code

A neuron that learns how to play blackjack

A very simple C code that learns how to play blackjack, by playing it with itself. The code implements a bona fide neural network with back-propagation. The network is made of a single neuron, possessing a single byte of intelligence. The neuron has ``weights'' and a ``threshold''; the neuron ``fires'' when the value of the activation function computed over the current input exceeds the current threshold. In game terms, using the current score and the number of aces on hand the neuron decides if to draw another card or to stay. The threshold (the neuron's memory) is adjusted as the game proceeds, using back-propagation -- if you can discern it in such a primitive setting.
Version: The current version is 1.1, December 18, 1997
References: black-jack.c [10K]
ANSI C source code, albeit written in a C++ style. The code and its sample output are well annotated.

Algorithms and Data Structures

Last updated March 20, 2020