The feature in question is the notation for anonymous functions: the underscore. It seems that in Scala
_(just underscore) is the abbreviation for the identity function
f => f
((_:Integer).intValue)is the abbreviation for
((x:Integer) => x.intValue)
(x => Integer.valueOf(x))
anObject.method(_ ~ xxx ~ yyy)is the abbreviation of
anObject.method(f => f ~ xxx ~ yyy)
Let us briefly review the Montagovian semantics of natural languages. A declarative sentence such as
(1) John saw Mary.denotes a proposition, which is true just in case a person denoted by the name John indeed saw a person denoted by the name Mary. We can write the proposition as a formula in the Simple Theory of Types (STT):
(2) see mary johnwhere
johnare individuals (of the type
e) who go by the names Mary and John, respectively;
see, the meaning of the transitive verb `see', is a function symbol of the type
e -> e -> t. Here
tis the type of propositions. (In Church's STT, the type
iand the type
o. For some reason, Montague renamed the types).
Let us consider a sentence
(3) It was Mary who John saw.or, simpler
(4) Mary, John saw.We'd like to write a logical formula representing the meaning of (4). We would like our semantics to be compositional: the meaning of a sentence should be composed from the meaning of its parts. Thus, we have to determine the meaning of the part ``John saw.'' A transitive verb requires two arguments: first an object, then a subject. The phrase ``John saw'' seems to be missing an object. To handle such phrases -- verb phrases (VP) missing a component -- linguists have introduced a so-called `trace', often written as an underscore. The trace is assumed silent (not pronounced). Thus the sentence (4) is to be written
(4') Mary, John saw _.
The sentences with traces are common. Examples include raised questions and relative clauses:
(5) Who did John see _. (6) I know whom John saw _. (7) The person John saw _ ran away.Some theories of quantifier scope use trace to explain how a quantifier word such as `someone' influences the meaning of -- `takes scope over' -- the entire sentence. This so-called `quantifier raising' posits that a sentence such as
Alice thinks someone left.should be analyzed as if it were
someone (Alice thinks _ left).
It seems like a good idea to take the meaning of ``John saw _'' to be
(8) fun x -> see x johnThe meaning of (4') then can be composed from the meaning of its components as
(9) (fun x -> see x john) maryA beta-reduction then gives us (2). Chomsky might say that the transformation from the surface form of the utterance (4') to the logical, hidden form (2) involves a movement of Mary into the position occupied by the trace.
We now face the question of the meaning of trace and using that meaning to compositionally derive (8) from ``John saw _.'' We observe that ``John saw _'' looks like ``John saw Mary'' with ``Mary'' taken out. Therefore, ``John saw _'' is a context , a term with a hole; trace denotes the hole. Continuations are the meanings of contexts. Since ``John saw _'' is a sub-phrase of the larger sentence (4'), our context is partial -- or delimited, -- with a delimited continuation as its meaning. A delimited continuation can be captured by a delimited control operator
reset (see (shift (fun k -> k)) john)indeed reduces to
fun x -> see x johnTherefore, we may take the meaning of trace to be
(shift (fun (k -> k)). To type that expression, we need shift with the answer-type modification.
The topic of delimited control in natural languages is discussed in great detail by Chung-chieh Shan in a book chapter ``Linguistic Side Effects.'' He analyzes more complex sentences, for example:
(10) Who _ saw his mother?
A pronoun could be treated like a trace. The sentence (10) hence exhibits three components with side-effect: the question word, the trace, and the pronoun. Chung-chieh Shan explains their interactions.
Coming back to Scala, it seems that
_ is the trace and the expression boundary is a reset.
It is remarkable that natural and programming languages can inform each other. Consciously or not, Scala seems like the best example of such mutual development.
We describe the relationship between the surface form of a sentence
and its meaning as two non-trivial but mechanical interpretations of
the same (hidden) abstract form. Our approach, in the tradition of
abstract categorial grammar (ACG), deals with several languages: The
A is the language of abstract form; the language
I is the
language in which we write a compositional interpreter of
produces a term in a language
T. The language
A is simply-typed
lambda-calculus with constants.
A syntactic interpretation
Isyn gives a surface form for a sentence
Tsyn is a set of strings or utterances. This
interpretation is responsible for word order, case marking,
subject-verb agreement, etc. Therefore,
A is abstracted from these
A can be determined from
Tsyn by parsing. If
the abstract language is simple -- that is, its constants have types
of low order -- parsing is tractable. This is the case for our ACGs.
A semantic interpretation
Isem computes the meaning of
A as a logical
Tsem is thus the language of predicate logic.
deals with quantifiers, pronouns, question words and
their interactions. Our
Isem interpreter is non-trivial:
it may fail. Although we may parse a sentence to an
abstract form, we may fail to compute any meaning for it, if the
sentence just `doesn't make sense'. Our
Isem is not a regular
lambda-calculus: it expresses computational effects such as mutation
and continuations, and it is not normalizable. The interpreter
Isem :: A -> Tsem may be represented as a composition of two
I1 :: A -> T1 and
I2 :: T1 -> Tsem for some
T1. A complex transformation may be assembled
from simpler steps.
The computational effects let us express in a modular fashion the interaction of a phrase with its context, and analyze different scopes of quantifiers and scoping islands.
Abstract Categorial Grammar (ACG) is a grammar formalism based on typed lambda-calculus. The syntactic side of ACGs -- parsing algorithms, generative power and other formal-language--theoretical properties -- is under active investigation. ACGs can also elegantly model syntax-semantics interface and compute truth conditions. This semantic side of ACGs has received much less attention.
We describe a generalization of ACGs that lets us give the standard dynamic-logic account of anaphora and analyze quantifier strength, quantifier ambiguity and scope islands. Most of these ACG analyses have not been possible before; prior ACG analyzes of quantifier ambiguity required type lifting, hence higher-order ACGs with very high parsing complexity.
Our generalization to ACG affects only the mapping from abstract language to semantics. We retain all ACG benefits of parsing from the surface form. By avoiding type lifting we keep the order of the abstract signature low, so that parsing remains tractable.
The generalization relies on `applicative functors', which extend function applications. The fact that applicative functors compose lets us take the full advantage of modularity and compositionality of ACGs. We assemble the semantic mapping from separate, independently developed components responsible for a single phenomenon such as anaphora, coordination, universal or indefinite operator. We have implemented the generalized ACGs in a `semantic calculator', which is the ordinary Haskell interpreter. The calculator lets us write grammar derivations in a linguist-readable form and see their yields, types and truth conditions. We easily extend fragments with more lexical items and operators, and experiment with different semantic-mapping assemblies.
The definition of the Abstract language and its mapping to the surface `syntax'
We define Abstract for a small fragment, later extended with sentence conjunction, anaphora and quantifiers.
The language of meaning: First-order predicate logic
We also lift the logic through an applicative.
The syntax-semantic interface
We demonstrate the interface for the small fragment, and its modular extension for sentence conjunction, dynamic logic, and two levels of quantification. We illustrate the successive, multi-stage re-writing of Abstract into the meaning.
Standard CPS and State Applicatives, which could have been defined in the standard applicative library
Abstract Categorial Grammars (ACG) represent syntax-semantic interface by regarding the surface and logical forms as two different interpretations of an abstract form. All three forms are represented in ACG uniformly, as closed (normal) terms in typed linear lambda-calculus over a signature defining base types, constants and the type assignments to the constants. An interpretation of one, abstract calculus, in another, object calculus is formalized in a lexicon , which is a homomorphic, typing preserving map of constants and types of the abstract calculus into terms and types of the object calculus.
Strictly speaking, the mapping of an abstract term
tA to an object term includes not only the mapping of constants of
tA as specified by the lexicon, but also the normalization of the result. The ACG literature shows the normalization (beta-reduction) step when presenting examples; the step, however, is left implicit in the formalism. Since simply-typed lambda-calculus is strongly normalizing, the formalities are not needed.
To model semantics, that is, design an ACG lexicon to map an abstract form to a logical form, extensions are often necessary. For example, the linearity requirement is dropped. We carry out further extensions. First, we distinguish a subset of terms of the logical form calculus and call them values. We define the process of reduction of terms to values as left-to-right call-by-value evaluation. We call the logical form calculus a (semantic) metalanguage.
The mapping of an abstract term
tA to a logical form is an explicitly two-stage process: we map the constants of
tA to metalanguage terms, and evaluate the result. The evaluation may fail to yield a value: the mapping of
tA to a semantic form is partial. We take the failure as an indication of ungrammaticality of
tA . The corresponding surface forms are thus judged syntactically correct but meaningless. In a departure from the Categorial Grammar tradition, the rules of grammatical composition are specified by the evaluation rules of the metalanguage -- and only incompletely by the underlying type calculus. Types are quickly-decidable approximation of the evaluation rules. Whereas categorial grammars and type-logical grammars stress finding proofs, we stress deterministic evaluation.
To avoid type-lifting and still model (covert) movements we add to our metalanguage multi-prompt delimited continuations. Such continuations implement mutable dynamic binding (or, `local store'), which is a computational model of Cooper storage.
We can use an existing programming language as our metalanguage, which gives us a ready-made implementation of computational abstract grammars. We straightforwardly encode abstract forms and lexicons in OCaml, and see the logical forms as the result of running the corresponding OCaml programs. We also take advantage of the abstraction, modularity, and interactive development and debugging facilities of OCaml.
A Haskell translation of an early version of
symantics1.ml , for comparison
Sylvain Pogodalla: Syntax-Semantics Architecture: Comparing CG and ACG
Section 4 of the Advanced ESSLLI09 course on Abstract Categorial Grammars
< http://www.loria.fr/equipes/calligramme/acg/publications/esslli-09/2009-esslli-acg-week-2-part-2.pdf >
We use the examples from Sec 4.4.
Typed Tagless Interpretations
Abstract Categorial Grammars are the instance of Typed Tagless (Final) Interpretations.
Dynamic Logic in ACG: Discourse Anaphora and Scoping Islands
Our enhancement to ACG affects only the mapping from abstract language to semantics. We retain all ACG's benefits of parsing from the surface form. Crucially, by avoiding type lifting we keep the order of the abstract signature low, so that parsing remains tractable.
We regard the mapping from abstract language to semantics partial: some sentences, albeit well-formed, just don't make sense. We model this partial mapping as a potentially failing computation in a call-by-value language with multi-prompt delimited control. The evaluation and type inference rules of the language are simple and deterministic. Control prompts may be regarded as loci of binding or quantification, used by quantified phrases and pronouns and set by context. We arrive at the mechanism of interaction of a phrase with its context that determines the scope.
OCaml code with the analyses of the puzzles discussed in the talk
The code presents abstract forms for the puzzles, surface forms and the computed logical formulas, if exist.
< http://www.loria.fr/equipes/calligramme/acg/ >
Abstract Categorial Grammar Homepage
The delimcc library: delimited continuations for OCaml
We specify the compilation from the intermediate language to the STT denotation as reduction rules. Hence the intermediate language is (call-by-name) lambda calculus whose constants are STT formulas. To compile an intermediate language term to its STT denotation we evaluate the term. The compilation is not always compositional: so-called scopal expressions contribute their meaning in the places other than where they are seen or heard. Scopal phenomena include generalized quantification, wh-phrases, topicalization, focus, anaphora and binding. To account for these phenomena, our intermediate language includes control operators shift and reset. We are thus able to improve on the previous continuation-based analyses of quantification, binding, raised and in-situ wh-questions, binding in wh-questions, and superiority.
The evaluation of an intermediate language term does not always lead to the STT denotation: the evaluation may get stuck. In that case, we regard the corresponding source language phrase as ungrammatical. We introduce a type system to delineate a set of intermediate language terms whose evaluation never fails to produce denotation. Being well-typed becomes an alternative criterion of grammaticality. Typeability, unlike reducibility, has an advantage of being applicable to separate phrases rather than whole sentences. Our main result is that both typing and call-by-name are necessary to correctly predict superiority and binding in wh-questions with topicalization, without resorting to thunking or type raising and thus maintaining the uniformity of the analyses.
Presentation at the workshop, August 6, 2008. Hamburg, Germany.
The complete code with all the analyses described in the paper
Call-by-name typed shift/reset calculus
Our intermediate language
Joint work with Chung-chieh Shan.
oleg-at-pobox.com or oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!
Converted from HSXML by HSXML->HTML