From oleg-at-okmij.org Sun Apr 16 19:09:10 2006 To: caml-list@inria.fr Subject: ANN: Generic print function X-Comment: Jan 2011: gprint is integrated into the BER-MetaOCaml distribution, vN003 Message-ID: <20060417020910.BB844AC2A@Adric.metnet.navy.mil> Date: Sun, 16 Apr 2006 19:09:10 -0700 (PDT) Status: OR The facility that prints results and types of expressions evaluated at the top-level is now available anywhere in the program -- in bytecode- or natively compiled programs. Generic printing is a (perhaps unintentional) `side-effect' of MetaOCaml -- of the fact that a code value is not merely AST; the code value also captures the type and the type environment of variables and other values. Generic printing is a library that works with the unmodified MetaOCaml (which is _fully_ compatible with the regular OCaml). MOTIVATION First of all, there is an, arguably small, matter of print_int, print_char, etc. Most of the time the typechecker knows exactly what is the type of the data to print. Why should we spell it out in the suffix of 'print' (or as a format specifier in Printf). This small annoyance gets bigger if we deal with a complex data structure, such as a list of records whose elements are variants and arrays. There is no built-in print_xxx function for it: we _have to_ write our own. What is annoying is that OCaml knows darn well how to print the structure, if the structure is the result of an expression evaluated at top level. Alas, such printing is _not_ available in standalone programs, or if we want to use the print function somewhere in the code, where the structure is produced as an intermediate result. Such a printing is a useful debugging aid. We may also want to use the printing facility to write out the data structure, in a human-readable form, into various files. The top-level output is quite pretty and is useful beyond the top level. INTERFACE and SIMPLE EXAMPLE The core function is val fprint : Format.formatter -> ('a,'b) code -> string which takes a code value of any type, and pretty-prints it on the given formatter. The printed result is exactly the same as that by the top-level value printing. The function [fprint] returns the representation of the expression's type, as a string. The latter is arguably a frill, but it was easy to do, so just as well. For example, let pr_type et = Format.printf "\n%s@." et let () = let x = Some ([|(10,true);(11,false)|]) in pr_type (print ..) prints the following two lines: Some [|(10, true); (11, false)|] (int * bool) array option The first line is the value, and the latter (printed by pr_type) is the type. There was no need to define any custom printer for the value or its components. A more involved examples is 65 lines down in this message. AVAILABILITY Generic printing is integrated into BER-MetaOCaml distribution, vN003. http://okmij.org/ftp/ML/ DESIGN MetaOCaml lets us manipulate pieces of code as values. Whereas 1 is an int value, .<1>. is a code value, of the type ('a,int) code. MetaOCaml can print those code values: # let x = 1 in Trx.npc ..;; .<1>. # let x = 'a' in Trx.npc ..;; .<'a'>. so far, so good. However, # let x = "a" in Trx.npc ..;; .<(* cross-stage persistent value (as id: x) *)>. And here we hit the snag. If we use a different printing function, # let x = "a" in Trx.printcode ..;; expression ([0,0+-1]..[0,0+-1]) ghost Pexp_cspval (as id: "x") we see that the code value is internally an AST, Parsetree.expression. We also see that aside from a few simple cases, MetaOCaml does not inline the values from the captured variables; rather, MetaOCaml incorporates references to such variables (so-called, cross-stage persistent (CSP) variables). We need the second observation: the code value is intended to be evaluated (i.e., `run'). The compilation of a code value generally requires its type. For example, to compile [match x with Foo -> ...] we need to know the type of [x]. In particular, we need to know if [Foo] is the only variant. If so, the above match is exhaustive and we do not need to compile the default case: [_ -> raise Match_error]. Therefore, when MetaOCaml captures the reference to a CSP variable, it has to, in general, capture the type as well. And it does, in a special AST node, which contains the corresponding Typedtree.expression. The latter includes the type and the type environment with declarations, etc. These are exactly the data that top-level's generic print function needs. The common, and correct, reply to the frequently asked question as to why OCaml does not have generic print is follows: printing a value requires the knowledge of its type. Indeed, a machine integer '1' may represent, inter alia, both an integer 0 and a boolean 'false'. The type information is not preserved in the compiled code. Fortunately, MetaOCaml's code values do preserve the necessary type information. COMPLEX EXAMPLE Let us first define the following complex data type: module C = struct type 'a color = Blue | Green | Rgb of 'a end;; type 'a image = {title : string; pixels : 'a C.color array};; type big = int image list;; let v = [ {title = "im1"; pixels = [| C.Blue; C.Rgb 10 |]}; {title = "im2"; pixels = [| C.Green |]}; ] The following expression let () = pr_type (print ..) prints exactly what we expect. Before continuing the example, we should note a drawback of the current lack of integration of the generic print facility with MetaOCaml. When doing [print ..] where x is of a variant type and its current value is a constant constructor (e.g., None), we see the output '(* cross-stage persistent value (as id: x) *)'. This is a drawback of some optimizations in MetaOCaml, and will be fixed if this code is integrated into MetaOCaml. Fortunately, there is an easy workaround: replace [print ..] with [print (let z = [x] in ..)]. We now continue the example: open C let some_processing ims = let brighten px = let new_px = match px with Blue -> Green | Green -> Rgb 10 | Rgb x -> Rgb (x+1) in let () = Format.printf "@.pixel: %a -> %a @." (fun ppf v -> ignore (fprint ppf v)) (let x = [px] in ..) (fun ppf v -> ignore (fprint ppf v)) (let x = [new_px] in ..) in new_px in let process im = let () = Format.printf "Processing: " in let _ = print .. in {im with pixels = Array.map brighten im.pixels} in let res = List.map process ims in let _ = print .. in Format.printf "@." let () = some_processing v;; The list of images, an image itself, and a single pixel were all printed generically. We did not have to define any custom printers. Here's the output: Processing: {title = "im1"; pixels = [|Blue; Rgb 10|]} pixel: [Blue] -> [Green] pixel: [Rgb 10] -> [Rgb 11] Processing: {title = "im2"; pixels = [|Green|]} pixel: [Green] -> [Rgb 10] [{title = "im1"; pixels = [|Green; Rgb 11|]}; {title = "im2"; pixels = [|Rgb 10|]}] POLYMORPHISM vs. GENERIC The function print is generic but not polymorphic. For example, if we define let pr x = print ..; x and invoke the function as "pr [10]", we see the printed output "". The function 'pr' has the type 'a->'a -- that is, it promises to take the value of any type, regardless of its structure. The function does not even need to know what is the exact type of its argument, because it is irrelevant. Informally, an OCaml function of the type ['a-> ...] corresponds to the Haskell function [a -> ...]. OTH, an OCaml function of the type [('a,'b) code -> ...] corresponds to Haskell's [Typeable b => b -> ...]. The latter enables generic programming.