Lambek Grammars as an embedded DSL: a tool for semanticists




The motivation for this project was to automate the most tedious and error-prone parts of type-logical grammars. Typical research in this field involves (re-)analyzing challenging examples: drawing derivations by hand and type-setting them as figures in LaTeX. It is time-consuming and error-prone, and limits the size and the number of examples one could reasonably try. Wouldn't it be good if one could enter derivations, in a reasonably intuitive notation, and have the computer check if they are correct, with helpful error reporting? The computer could then produce the LaTeX code for the figures to put in papers. We may also collect libraries of derivations, to quickly assemble the derivation of a complicated sentence from already analyzed components.

We have built such a tool: an embedded DSL for Lambek Grammars -- to develop derivations compositionally from smaller fragments; save and reuse derivations and their fragments; check the validity of derivations and their yield; typeset them as figures. Building this DSL has lead to an unexpected theoretical result in the area of formal grammars: strong equivalence of context-free and a (naturally restricted) form of Lambek grammars -- a stronger version of the famous Pentus' result but in a very different approach. When converting a Lambek grammar to a context-free grammar (CFG) our approach avoids exponential explosion of lexicon rules and the excessive number of CFG productions.

The theoretical result is formally described in the accompanying paper. This web page is about the DSL itself and its embedding (in OCaml), mentioned only in passing in the paper. This DSL was used to produce all derivation figures in the paper and on this page.

To reiterate, the desired DSL should be reasonably intuitive to semanticists; its expressions should rather closely resemble the derivations they represent. The DSL should be modular and extensible, with easy to add lexical items. The DSL should use host language types as far as possible: invalid derivations should be immediately flagged as type errors, with good error messages. Run-time errors are to be avoided.

This page presents such a DSL for Lambek grammars. It has not been easy: the hypothetical reasoning, the characteristic of Lambek grammars, requires us to keep track of hypotheses and maintain their total order. The latter seems to require fancy types, in particular, type-level computations, not available in many languages, e.g., OCaml. The key idea (which has also lead to the theoretical result) was a new presentation of Lambek grammars, called LA, explained below. This presentation lets us view Lambek grammar derivations as an algebra, and hence embed in OCaml in the tagless-final style.

The current version is March 2020
Lambek Grammars are strongly equivalent to Context Free Grammars, for all practical purposes

Examples of using the tool

(Non-associative) Lambek calculi
Non-associative Lambek calculus is quite easier to embed: an associative operation is hard to model in term algebra, specifically, in types. Type functions seem to be needed.


Lambek Grammar: LA presentation

This section describes a non-traditional variation of a less-common natural deduction presentation of Lambek Calculus and Grammar -- to be called LA. It is provably strongly equivalent to the traditionally-presented Lambek grammars, but is easier to relate to algebras and especially context-free grammars, and also easy to embed in host languages without fancy types.

Lambek Grammars are based on the old idea of parsing as deduction, which goes back to Emil Post -- not surprisingly, since Post is both the father of the modern deductive presentation of logic and a grandfather of phrase-structure grammars. LA is, first and foremost, a deduction system (or, a calculus): a system to derive `things' from basic `things' taken as axioms, by application of re-writing, or deduction rules. The `things' derived/deduced in LA are judgements. The following explains what they are and what they are made of.

which are also called syntactic types or categories, are the formulas of LA. Meta-variables A, and B and C, possibly adorned with subscripts and diacritics, will stand for an arbitrary type. Types are inductively built from primitive types using left- and right- slashes: if A and B are types then so are B/A (to be read ``B over A'') and A\B (read: ``A under B''). One may think of B/A and A\B as fractions, with the numerator B and the denominator A. The intuition goes back to Ajdukiewicz; Bar-Hillel slanted the fractional bar making the fractional `multiplication' non-symmetric and hence suitable for describing the grammar of languages with fixed word order.

Strictly speaking LA is not a single calculus but a family, parameterized by primitive types and the lexicon (below). For the sake of the grammar of (a fragment of) English, we choose as primitive types n, np and s. As convenient abbreviations, we let vp stand for np\s, tv for vp/np, adj for n/n, det for np/n, pp for n\n and rel for pp/(s/np).

is a non-empty sequence of types and special symbol called `mark'; meta-variables Γ or Δ stand for an arbitrary environment. The concatenation is written as comma: Γ, Δ. It is associative: (A,B),C is the same environment as A,(B,C) and hence parentheses are not needed. The order of types in an environment is significant: A,B is not the same as B,A. We write * for a sequence of one or more consecutive marks; abusing the notation, * is also to be called `mark'.
is a pair of an environment and a type, written as Γ |- A. (Some people may call it sequent; note, however, that LA is based on natural deduction, not sequent calculus.) Intuitively (or logically) Γ |- A means that the formula A holds/justified based on the assumptions in Γ. One may also read the judgement as an `unreduced fraction' with A in the numerator and Γ in the denominator. Judgements are usually named, with the name written next to the judgement in parentheses, as below.
are the initial set of judgements assumed to hold. They come in two parts. First is the axiom A |- A called (Var). It is actually an axiom schema, representing a set of judgements for each particular type A. The second part of axioms is the lexicon: a group of named judgements of the form * |- A. Here is a sample:
    * |- np  (john)             * |- n   (book)    * |- det (the)
    * |- vp  (vanished)         * |- tv  (read)    * |- rel (that)
Deduction rules
are the four rules for deriving judgements. LA is a form of natural deduction, and hence its rules are the elimination and the introduction rules for formula connectives (i.e., left and right slashes).
    Δ |- B/A    Γ |- A                Γ,A |- B
   --------------------  (/e)       ----------- (/i) 
         Δ,Γ |- B                     Γ |- B/A
    Γ |- A    Δ |- A\B                A,Γ |- B
   --------------------  (\e)       ----------- (\i) 
         Γ,Δ |- B                     Γ |- A\B
The elimination rules are logically speaking modus ponens rules, if we think of slashes as implications. For example, (/e) says that if the formula B/A is justified from Δ, and, in addition, A is justified from Γ, then B is justified from Δ and Γ, in that order. The order of antecedents within the rule is also important.

One may also read the rules arithmetically. The elimination rules are about multiplication. For example, (/e) states that multiplying the fraction with the numerator B/A (which is also a fraction) and the denominator Δ by the fraction with the numerator A and the denominator Γ gives the fraction with the numerator B and the denominator Δ,Γ -- just like the fraction multiplication in elementary school, only minding the order. The introduction rules are about division, of both the numerator and the denominator by the same `number'.

A judgement is called derived if it is either an axiom or obtained from (earlier) derived judgements by the application of a deduction rule. The rules used to derive the judgement from axioms can be depicted as a tree -- the derivation tree, or, simply, derivation -- with the derived judgment as the root and axioms in the leaves. Here is a sample derivation tree, of * |- s. (Later on we shall see a more interesting derivation.)
                                  * |- det (the)  * |- n (book)
                   * |- tv (read)            * |- np           
    * |- np (john)                   * |- vp                   
                              * |- s                           
If t is a derivation (tree), its fringe is a string of words naming the axioms that appear in its derivation. It is inductively defined as follows:
  • if t is an axiom named w, its fringe is w;
  • if t is a Var axiom A |- A, its fringe is the empty string;
  • if the last rule in t is an introduction rule with the premise t1, the fringe of t is the fringe of t1;
  • if the last rule in t is an elimination rule with the premises t1 and t2, the fringe of t is the concatenation of fringes of t1 and t2.

LA may also be regarded as a grammar. Recall, a grammar is a description of a language, that is, of a set of finite strings (often called sentences) built from a finite fixed set of `words' (so-called alphabet). Sentences are commonly displayed double-quoted, with words separated by white space. The alphabet of LA are the names of its lexicon judgements. A sentence is recognized/generated by LA just in case there is a derivation of * |- As whose fringe is the sentence. Here, As is a designated so-called `initial type', which is usually s. For example, the sample LA derivation shown earlier demonstrates that the string "john read the book" is the sentence of LA.

Like other Lambek grammars, LA is `lexicalized': different LA grammars all share the same four deduction rules. The grammatical structure of a particular LA language is hence encoded in the types of the lexicon judgements.

Lambek Grammars are strongly equivalent to Context Free Grammars, for all practical purposes
The paper that develops LA and proves it strongly equivalent to LG, among other thighs. See this paper for all details.


LA derivations as an algebra

We now show how to look at LA derivation trees as an algebra, which helps relate LA with context-free grammars (CFG) and implement as an embedded DSL.

We have seen a sample LA derivation tree already. Here is a more interesting one, with hypothetical reasoning and an introduction rule. It is the derivation of * |- s whose fringe is "the book that john read vanished".

                                                                * |- tv (read)  np |- np                   
                                                 * |- np (john)        *,np |- vp                          
                                                                *,np |- s                                  
                                 * |- rel (that)                * |- s/np                                  
                   * |- n (book)                         * |- pp                                           
    * |- det (the)                                * |- n                                                   
                                          * |- np                                        * |- vp (vanished)
                                                    * |- s                                                 

As any derivation tree, it is constructed by applying the deduction rules to derived judgements, each of which comes with its own derivation tree. For example, a branch at the top deriving *,np |- s is built from * |- np (john) (a leaf) and *,np |- vp (a tree of height two) by the rule (\e) with the particular instantiation of that rule's variables A, B, Γ and Δ. That instance of the rule (which we call ehvp for easy reference) may be viewed as an operation on trees: it takes any tree deriving the judgement * |- np and any tree deriving the judgement *,np |- vp and produces a tree deriving *,np |- s. The set of trees and such operations on them is hence an algebra. It is a multi-sorted algebra: trees are grouped by the judgement they are derivation of; a derived judgement is hence a `sort' of its derivation trees. The operations used to construct the sample trees in this and the previous sections are collected in the following table:

    john     : <*;np>            evp  : <*;np>   -> <*;vp>    ->  <*;s>
    book     : <*;n>             enn  : <*;n>    -> <*;pp>    ->  <*;n>
    the      : <*;det>           edp  : <*;det>  -> <*;n>     ->  <*;np>
    vanished : <*;vp>            etv  : <*;tv>   -> <*;np>    ->  <*;vp>
    read     : <*;tv>            ehtv : <*;tv>   -> <np;np>   ->  <*,np;vp>
    that     : <*;rel>           hnp  : <np;np>
                                 ehvp : <*;np>   -> <*,np;vp> -> <*,np;s>
                                 irnp : <*,np;s> -> <*,s/np>
                                 erel : <*;rel>  -> <*;s/np>  -> <*;pp>
The table -- or, `signature' -- lists the names of operations and the sorts of their arguments and results. As mentioned earlier, sorts in our algebra are derived judgements Γ |- A, written here in a more compact form <Γ;A>. The earlier ehvp is thus a binary operation on sorts <*;np> and <*,np;vp> with the result of the sort <*,np;s>. In the first column of the table are the nullary operations (constants), which is the lexicon of LA as it is.

The algebra with the above operations whose carriers are derivation trees will be called ALD1. The number 1 refers to the fact the environment of any judgement contains at most one hypothesis (so-called `hyp-rank'). ALD1 is an initial algebra of the above signature. Although its operations and constants are a dozen, its derivation trees are infinitely many. As grammar, ALD1 hence describes an infinite language (albeit not very interesting). ALD1 may indeed be viewed as a grammar: the above signature even looks like CFG (with oddly, backwards-written productions); the constants are terminals and the sorts are non-terminals. More precisely and formally the connection to CFG is elucidated in the paper.

The algebraic view of LA immediately leads to its implementation as an embedded DSL, in the tagless-final style.

Introduction to Algebras


Embedding in OCaml

The view of LA derivations as an algebra gives a method of its embedding, that is, implementing as an embedded DSL.

The simplest embedding implements the operations of the algebra such as ALD1 as host-language (OCaml) functions. The derivation tree of * |- s with the fringe "the book that john read vanished" can then be compactly represented as an OCaml expression

    evp (edp the (enn book (erel that (irnp (ehvp john (ehtv read hnp)))))) vanished
The fact the expression type-checks tells the derivation is valid. The evaluation of the expression prints the fringe or the LaTeX code, etc. depending on the concrete implementation. Although this embedding fulfills many desiderata, it is not convenient: operations are many and hard to remember. If we need to combine trees for the judgements * |- tv and * |- np, which operation should we use?

It would be good if the computer could tell us. After all, the operation names are uniquely determined by the sorts of the trees they operate on. The next step then is to find a way to represent sorts (that is, judgements) in OCaml, and implement the operation name lookup facility. In the host language like OCaml with polymorphism and Hindley-Milner inference, one could do better: replace concrete operations with operation schemas (polymorphic functions), and let type inference instantiate them as needed. One could almost go all the way back to the four generic deduction rules -- but not quite. Concatenation of environments requires type-level programming, which is not available in OCaml (and is brittle anyway). Therefore, our DSL asks the user to manually instantiate deduction rules for specific environment.

There is an important particular case, which happens to be the commonest: an elimination rule is not being applied to two (Var) axioms. That is, if one antecedent judgement of an elimination rule is (Var), the other must be a judgement whose derivation contains at least one lexicon axiom. Incidentally, this case subsumes all hyp-rank 1 grammars. In this particular case, we only need three instances for each elimination rule: when the first antecedent is the (Var) axiom, the second antecedent the (Var) axiom, and the generic case for all other antecedents. We explain these rules by examples next.

References [6K] [7K] [2K] [7K]
Makefile [<1K]
The complete OCaml code. See Makefile for building the library


Examples of using the tool

As a short tutorial on the embedded DSL for Lambek Grammars we show a few characteristic examples of building grammatical derivations, from elementary to right-node-raising and non-constituent coordination.

The DSL can be used interactively, using the OCaml interpreter. After starting the OCaml top level and loading the library (using the file) we enter the first derivation:

    module Ex0(S:LA) = struct
      open Common(S)
      let np1 = the / (tall / boy)
The derivation is actually the / (tall / boy): the, tall and boy are the lexicon axioms and slash stands for the (/e) rule (the generic version). The OCaml let-expression assigns to the derivation the name np1 for further reference. The other things are the boilerplate, hopefully not too bothersome. The OCaml top-level responds with:
    module Ex0 :
      functor (S : Lambek_LA.LA) ->
        sig val np1 : (unit * unit, np) S.repr end
which shows the (inferred) judgement derived by np1: * |- np (the mark is represented in OCaml as unit*unit). Generally, ('e,'a) S.repr is the OCaml type representing derivations for the judgement 'e |- 'a.

The just entered and checked derivation can be reused in other derivations:

    module Ex1(S:LA) = struct
      open Common(S)
      include Ex0(S)
      let sen1 = john @% (saw / np1)
Here, @% is the infix operation representing the (\e) rule. The earlier np1 is included as a part of a larger derivation, sen1. OCaml's response:
    module Ex1 :
      functor (S : Lambek_LA.LA) ->
          val np1 : (unit * unit, np) S.repr
          val sen1 : (unit * unit, s) S.repr
Thus sen1 is the derivation of * |- s. If we make a mistake:
    module Ex1(S:LA) = struct
      open Common(S)
      include Ex0(S)
      let sen1 = john / (saw / np1)
OCaml respond with a type error:
    Line 4, characters 13-17:
4 |   let sen1 = john / (saw / np1)
Error: This expression has type (S.mark, np) S.repr
       but an expression was expected of type (S.mark, ('a, 'b) f) S.repr
       Type np is not compatible with type ('a, 'b) f 
Here, mark is displayed as S.mark (which is the alias of unit*unit); ('a, 'b) f represents the forward-slash type 'a over 'b. (The backward-slash-type is written as ('a,'b) b). The error message says that in the above expression john was expected to have a forward-slash type, rather than np.

Checking the validity of derivations (and inferring types) is just but one feature of the DSL. We can also display the derivation fringe: (the OCaml's response is indented)

    let module M = Ex1(Fringe) in M.sen1
      - : string = "john saw the tall boy"
and the derivation tree itself
    let module M = Ex1(ShowDerivASCII) in ShowDerivASCII.print @@ M.sen1
                                                * |- adj (tall)  * |- n (boy)
                                 * |- det (the)            * |- n            
                   * |- tv (saw)                   * |- np                   
    * |- np (john)                          * |- vp                          
                                     * |- s                                

The first sample derivation on this page was printed similarly. The second running example, "the book that john read vanished", was represented by the following derivation

    module PaperEx(S:LA) = struct
      open Common(S)
      let john_read_ = ir (john @% (efr read np))
      let ex = (the / (book @% (that / john_read_))) @% vanished
For clarity, it is built by parts. First is the "john read _" clause, derived with hypothetical reasoning: efr is the instance of (/e) rule whose right argument judgement is a hypothesis (the (Var) axiom) for the type np. The operation ir is the introduction rule, discharging the rightmost hypothesis. The john_read_ subderivation is then used in a larger but simpler derivation. We can compute and confirm the fringe of this derivation and display it as a tree (the one shown earlier on the page).

More interesting is right-node-raising: "john loves and bill hates mary". It is also built by parts, entering a subderivation, getting it check, confirming yield, and reusing:

    module ExRNR(S:LA) = struct
      open Common(S)
      let john_loves = ir (john @% efr loves np)
      let bill_hates = ir (bill @% efr hates np)
      let and_bh = and_ (s // np) / bill_hates
      let jl_and_bh = john_loves @% and_bh
      let sen = jl_and_bh / mary
The coordinator and_ is polymorphic, and hence takes as argument the type of the coordinated clauses, s/np in our case (represented in OCaml as s // np).

A more complicated non-constituent coordination example "john gave the present to mary on Thursday and to bill on Friday" is analyzed similarly:

    module ExNCoord(S:LA) = struct
      open Common(S)
      let maryTh = il ((efl (vp // pp) (toP / mary)) @% onThursday)
      let billFr = il ((efl (vp // pp) (toP / bill)) @% onFriday)
      let mtbf  = maryTh @% (and_ (some @%% vp) / billFr)
      let sen = john @% ((giveTo / (the / present)) @% mtbf)

The general interface of the DSL lets us prove theorems, e.g., the admissibility of the lift rule, as shown in the full code.

References [4K]
Complete code for the examples [<1K]
Initialization file to load the library in the OCaml interpreter