I/O Framework Proposal

* Principles

** This is a framework

This document proposes a framework for building ducts of various kinds.
The proposal also defines several standard ducts, such as fixed and
variable block i/o ducts, file ducts and pipe ducts. Each duct can be
extended (layered upon).

This proposal does not care about low-level details of a duct
implementation. For example, an access method (aka "URL schema")
"firewire:" may be implemented by linking in an external
(foreign-function) Firewire protocol library or by using
appropriately-defined peek and poke functions. Accessing device
registers and the i/o space, handling interrupts, and accessing
firmware on an i/o card are specifically not addressed by this
proposal. Incidentally, all this low-level functionality is possible
via a foreign-function interface.

** Layering is paramount

Layering is a principal feature. One i/o stream can be layered upon
another i/o stream, etc., giving rise to hierarchies or even directed
acyclic graphs. For example, the lowest-level stream can read data in
octets or in ethernet frames. A stream on top of that can provide
abstractions of characters or bytes. Other streams can take care of
assembling data into words, data structures, or characters. It is
_often_ useful to layer a stream that can read at most N units (octets,
bits, etc.) from a lower-level stream -- after that, the new stream
should return EOF. A layered stream may transparently accumulate the
SHA-1 digest of the read data and check the signature afterwards.

It is important to be able to push and peel off the layers at run time.
The latter is indispensable when dealing with HTTP streams that carry
multi-part messages. All the issues of character encoding can be
relegated to the higher-level streams.

The underlying duct of a layer is always available via a parent-duct
property. Using this property, you can navigate the entire sequence of
layers and read or set properties in any of them. No implicit syncing
is done.
An inconsistency may arise -- just as it may arise if you do fopen()
and then do lseek() on the result of fileno() of the opened stream. If
we are working in a distributed environment, it's hard to guarantee the
consistency in the "protocol."

** Extensibility

The framework must allow the addition of new duct types, for example,
to support various peculiar devices including terminals, protocols such
as mmap, firewire, SCSI or Fibre Channel, synchronous and asynchronous
i/o.

Extensibility and genericity free us from the need to describe all
possible kinds of i/o behavior in this proposal. Therefore, this
proposal can safely delegate terminal i/o, HTTP and TLS protocols, etc.
to other SRFIs. The present proposal specifies just the bare minimum:
the "null:" duct that can be extended to anything else, fixed and
variable buffer i/o, regular file i/o, and pipe i/o.

** It is not POSIX, but should run on POSIX

We should not consider ourselves bound by POSIX. We merely need to make
sure that whatever we come up with can be implemented using POSIX --
somehow. Perhaps one Scheme function will map to several POSIX calls --
and vice versa. Many POSIX interfaces clearly bear the mark of the lack
of tuple and list datatypes in C. Furthermore, our Scheme i/o functions
do not have to follow POSIX semantics to the letter.

Regardless, we should provide the level of control over the data
transmission that some users have come to expect and need. A
significant part or all of the POSIX functionality will therefore be
offered in one way or another. We specifically disclaim, however, the
goal of being POSIX compliant in every detail.

** Units of i/o vary

A unit of i/o is an atomic piece of data read or written through a
duct. In fact, ducts transmit data in terms of vectors of units.
Different ducts have different ideas of units: some ducts deal in
octets (small integers), some in characters, some in frames, some in
large integers or floating-point numbers, and some in user-defined
structures. A unit can be of a fixed or of a variable (bit) size. A
user may query a duct for the size of its units (in terms of octets or
bits).

The entire read/write interface is polymorphic with respect to the
units. Data transmission methods accept vectors of units to transmit,
or vectors to populate with incoming data. The vectors may be uniform
vectors such as u8vector, f64vector (SRFI-4), strings, or proper Scheme
vectors.

Q: bits? vector of bools? or a packed u8vector?

** It is not a character i/o

The character i/o (along with the issues of encoding, locale, etc.)
belongs to higher-level ducts. The layered approach permits us, e.g.,
to read raw bytes (using a lower-level stream), to suspend and resume
the decoding of characters, to read half-characters, and to change the
encoding on the fly. We need to do all this to process HTTP streams
with multi-part data, for example.

The present proposal can accommodate all the variety of character i/o
-- but it leaves it to some other SRFI.

** It is not risky or unproven

The general approach of this proposal is closely related to the Basic
IO layer of OpenSSL [BIO] and to the "connections" of Java 2 Micro
Edition (J2ME) [J2ME]. These are all mature, widely used
implementations. Some ideas and the inspiration are borrowed from
NetGraph [NetGraph] and Plan9. Nothing in the present proposal is risky
or unproven.

** Only five functions

This proposal defines one data structure -- OS:duct -- and only five
functions on it -- and 'read' and 'write' are not among them! The
functions are: OS:duct-open, OS:duct-extend, OS:duct?,
OS:duct-property-get and OS:duct-property-set!. All the various i/o
functions, including read-char, peek-char, write-char, and ioctl, can
be implemented in terms of the five basic functions.
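
As an illustration of the last claim, here is a minimal sketch of how a
read-char could be reached through a property lookup rather than
through a dedicated function. It is a toy model under stated
assumptions, not the reference implementation: the names make-toy-duct,
toy-duct?, toy-property-get, toy-property-set!, open-toy-string-duct
and toy-read-char are hypothetical stand-ins for the five functions; a
duct is modelled as a tagged vector holding an association list; the
symbol 'eof stands in for a real end-of-file indication; and the
procedure `error` (R7RS or SRFI 23) is assumed to be available.

```scheme
;; Toy model (hypothetical, not the reference implementation):
;; a duct is a tagged vector whose second slot is an association
;; list mapping property names to values.
(define (make-toy-duct properties)
  (vector 'toy-duct properties))

(define (toy-duct? obj)
  (and (vector? obj)
       (= (vector-length obj) 2)
       (eq? (vector-ref obj 0) 'toy-duct)))

(define (toy-property-get duct name)
  (let ((entry (assq name (vector-ref duct 1))))
    (if entry
        (cdr entry)
        (error "toy-property-get: no such property" name))))

;; Design choice of this sketch: only existing properties can be set.
(define (toy-property-set! duct name value)
  (let ((entry (assq name (vector-ref duct 1))))
    (if entry
        (set-cdr! entry value)
        (error "toy-property-set!: no such property" name))))

;; Stand-in for opening a "string:" duct. The read-fn property holds
;; a procedure that copies up to (- end start) characters into buffer
;; and returns the number of characters actually transferred.
(define (open-toy-string-duct data)
  (let ((pos 0))
    (make-toy-duct
     (list (cons 'schema "string")
           (cons 'name data)
           (cons 'read-fn
                 (lambda (duct buffer start end)
                   (let loop ((i start))
                     (if (and (< i end) (< pos (string-length data)))
                         (begin
                           (string-set! buffer i (string-ref data pos))
                           (set! pos (+ pos 1))
                           (loop (+ i 1)))
                         (- i start)))))))))

;; read-char expressed purely as "get a property, then apply it".
(define (toy-read-char duct)
  (let* ((buffer (make-string 1))
         (reader (toy-property-get duct 'read-fn))
         (count  (reader duct buffer 0 1)))
    (if (= count 1) (string-ref buffer 0) 'eof)))

;; Usage
(define d (open-toy-string-duct "hi"))
(toy-read-char d)   ; => #\h
(toy-read-char d)   ; => #\i
(toy-read-char d)   ; => eof
```

Because the reader is just the value of a property, a higher layer
could in principle replace or wrap it without toy-read-char changing.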

** Properties

All data transmission and control are performed by reading or writing
_properties_. Properties can be "real" (that is, slots in a data
structure) or "virtual". Properties can be read-only, write-only, or
read-write. A user should never assume that a value assigned to an
arbitrary property can be later retrieved from that property. Kernel
hackers will easily recognize our properties as 'device registers'. We
must note that the values of some properties may be procedures -- which
are first class in Scheme.

The reason we implement all the operations, including reading, writing
and closing, as getting or setting properties is extensibility. A
higher-level stream can "intercept" the operation of reading or writing
a particular property or a set of properties. Furthermore, reducing
everything to getting or setting properties makes our interface generic
-- as it must be for any proposal that deals with i/o.

Incidentally, the SNMP protocol is defined entirely in terms of getting
and setting properties. The protocol has proved itself flexible and
capable of managing a vast number of various devices. SNMP is also
lightweight and is easy to implement, even in firmware.

** Mini-languages

To open a duct, the user must submit a URL describing the target
resource. To read, write or otherwise manipulate that resource, the
user must read and write appropriate duct properties. URLs along with
MIB (management information base -- an SNMP term for collections of
properties) are "mini-languages". However reprehensible the idea of
mini-languages is to some people, we feel that it is the best approach
in terms of genericity and extensibility. We must note that many other
widely used protocols and interactions -- for example, HTTP, SNMP,
SMTP, the /proc file system -- are formulated in terms of
mini-languages.

Particular ducts may choose to hide the URLs and the acts of getting
and setting properties behind a procedural interface -- thus giving a
programmer a "familiar" veneer of an API.

** Errors

Errors generate exceptions as described in the exception SRFI and the
i/o condition SRFI.

** Blocking

When creating a duct, a user may specify the desired blocking behavior.
An i/o operation may block indefinitely, may block and then time out
(raising an exception), or may not block at all and either invoke a
specified 'yield' function or immediately return from a read or write
function with no data transmitted. The blocking behavior of an existing
duct may be altered.

It's easier to deal with timeouts as we deal with NaNs. It's too much
of a hassle to check for timeouts after every operation -- just as it
is a hassle to check for overflow or bad numbers after each arithmetic
operation.

In Metcast, for example, the answer to a request must be generated
within reasonable time. If it takes too long, I abort the query and
clean up. It doesn't really matter to me which particular i/o operation
or a database interaction blocked. Besides, if it takes too long to
reply, a client or a web server will time out the connection anyway. I
have a special function within-timeout, which is executed at the top
level of the query processing.

It seems Erlang promotes a similar pattern, of a worker process and a
supervisor. A supervisor is in charge of monitoring the worker and
checking that it makes progress within reasonable time (by a heartbeat
or by other means). If a worker appears to be blocked, it is killed and
restarted.

------------------------------------------------------------------------

* An OS:duct

This proposal defines an abstract datatype, an OS:duct. Objects of this
type are recognized by the predicate OS:duct?.

A duct -- which is typically a record or a closure -- is a list of
properties. Some or all of the properties may be virtual.
In the reference implementation [TBW] a duct is a record and the list
of properties is an assoc list. A hash table is a better alternative,
however. The list of properties is one field of the record. A record
representing a particular kind of duct may contain other fields for its
own private use.

The proposal defines two functions to create ducts, a function to read
a property and a function to set a property.

* Creating a duct

procedure: OS:duct-open URL PROPERTY ... -> DUCT

OS:duct-open is a duct factory.

Here URL is a URL:
    - relative POSIX file name
    - absolute POSIX file name
    file://absolute-POSIX-file-name
    http://...
    ftp://...
    tcp://host-name-or-ip:port
    string:
    string:char-data
    | command
    pipe:command
    null:

PROPERTY is a (PROP-NAME PROP-VALUE) or PROP-NAME. The latter is a
shorthand for (PROP-NAME #t). Many properties have default values.
PROP-NAME is a symbol.

Properties passed to OS:duct-open are, in general, a subset of duct
properties. Usually (although not always) a property passed to
OS:duct-open is available and can be retrieved (and sometimes set) from
the created duct using OS:duct-property-get and OS:duct-property-set!.

OS:duct-open returns a created duct or raises an exception (see the IO
Condition SRFI).

** Creating a duct over another duct

The procedure OS:duct-open can be used to overlay a transparent or a
translucent layer over an existing duct or ducts. In this case, URL
will look something like "input-buffer:" or "mime-layer:", and one of
the PROPERTYs will typically be (parent-duct a-parent-duct). See
examples on the discussion page.

If you have a duct-a and pass it, as a parent-duct, to OS:duct-open to
overlay a duct-b on top of it, you can still use duct-a. "Extending" a
duct does not destroy the lower-layer duct. You have to be careful,
though, and 'sync' the higher-level duct before manipulating a
lower-level duct.

Duct extension is dynamic: you add a layer to the duct, forget about
the added layer and use the parent duct, add another layer to the same
basic duct, etc. In fact, you must be able to add layers dynamically
when processing an HTTP stream with a multipart reply, for example.

* Duct predicate

procedure: OS:duct? DUCT -> BOOL

* Getters and setters

procedure: OS:duct-property-get DUCT PROP-NAME -> VALUE
procedure: OS:duct-property-set! DUCT PROP-NAME NEW-VALUE

Understandably, not all duct properties may be set. Many of the
settable properties can be "virtual".

All fcntl and ioctl operations can be done this way. For example, to
shut down a socket, we can do

    (OS:duct-property-set! duct 'shutdown 'sender)

------------------------------------------------------------------------

* General properties

This is a MIB (in SNMP terms). Perhaps we should adopt the MIB format
to some extent? Incidentally, SNMP is specified as the framework
document and a collection of RFCs for various particular MIBs. This
proposal is the framework document.

Here's a partial list of properties, and domains of their values.
Higher-level ducts may provide more properties. Some of the properties
may be used in OS:duct-open, some of them may not, and some of them
must be used (for ducts of a particular type).

The default value of a property is the value the property takes when it
may be passed to OS:duct-open but the user chose not to pass it.

** schema
A string, read-only.
The type of the duct, often the "schema" part of the URL for which the
duct is open. OS:duct-extend may explicitly specify this property.
OS:duct-open may not use the property.

** name
Any Scheme object.
The name, or identification, of the particular instance of the duct. A
file duct names the duct after the file name.

** read?
Boolean.
True if the duct is readable.
???: We postulate further down that only readable ducts possess the
read-fn property. Should read? be thus an open-only property?

** write?
Boolean.
True if the duct can be written to.
???: open-only?

** append?
Boolean, default is #f.
True if writing to the duct always appends the data to the end of a
"target resource." For some ducts, this property has no meaning.

** eof?
Boolean, default is #f.
True if the duct is currently at EOF. For some ducts, this property is
settable.

** eof-handler
A procedure or #f.
What to do on EOF. #f means that reading finishes immediately with 0
bytes transferred. A procedure is a procedure of one argument: the
current duct. Its return result is disregarded. Default value: a
procedure that raises an EOF exception. However the EOF condition is
handled, the eof? property is asserted.

** fail?
Boolean, default is #f.
???
Setting fail? and eof? to #f is equivalent to the clearerr() function
of C.

A C++ stream has another property: 'bad', which is set when the stream
could not be opened. In C++, this property was introduced because often
a stream was opened by a constructor, and a constructor could not
return a failure indicator. This was before C++ exceptions. If we have
exceptions, we do not need a field to indicate the failure to create an
object.

** schemas
An open-only property.
A list of schemas for which the duct is permitted to be opened. Default
is the symbol 'schemas-all.
This is useful when we try to open a file whose name is derived from
data received from the network. We don't want the system to interpret
such file names too permissively (and open a pipe to a shell if the
client managed to embed a pipe: prefix into the file name).

** timeout
Polymorphic, default is 0.
A number of milliseconds or other suitable time value: the i/o timeout.
The operation is aborted if it times out.
Value 0: the operation may block until completion.
Value #f: the operation does not block. Reading or writing returns with
the count of 0 if the operation cannot be completed immediately without
blocking.
Value thunk: if the requested operation is about to block, invoke the
thunk. If (when) the thunk returns, retry the operation.
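
The four kinds of timeout value can be read as a small dispatch. The
following is a sketch only, under stated assumptions: the name
transfer-with-timeout and both of its helper arguments are hypothetical
and not part of the proposal. try-once stands for "attempt the transfer
without blocking" and wait-ready for "wait until the duct is ready";
how a real duct would implement either is left open. The procedure
`error` (R7RS or SRFI 23) is assumed.

```scheme
;; Hypothetical helper, not part of the proposal. Arguments:
;;   timeout    -- the value of the duct's 'timeout property
;;   try-once   -- thunk: attempt the transfer without blocking; returns
;;                 the unit count, or #f if it would have to block
;;   wait-ready -- procedure of one argument (a time limit, or #f for
;;                 "no limit") that waits for the duct to become ready;
;;                 returns #t if it is ready, #f if the time ran out
(define (transfer-with-timeout timeout try-once wait-ready)
  (let ((count (try-once)))
    (cond
     ;; The operation completed without blocking (0 is a valid count).
     (count count)
     ;; Value #f: never block, report 0 units transferred.
     ((not timeout) 0)
     ;; Value thunk: invoke it, and retry when it returns.
     ((procedure? timeout)
      (timeout)
      (transfer-with-timeout timeout try-once wait-ready))
     ;; Value 0: block until the operation can complete.
     ((and (number? timeout) (zero? timeout))
      (wait-ready #f)
      (transfer-with-timeout timeout try-once wait-ready))
     ;; Any other value: wait at most that long, then retry. A real
     ;; implementation would track the remaining time across retries;
     ;; this sketch restarts the full timeout on each retry.
     ((wait-ready timeout)
      (transfer-with-timeout timeout try-once wait-ready))
     ;; The wait ran out: abort the operation.
     (else (error "i/o operation timed out" timeout)))))
```

Keeping the policy in one place like this is one possible way to let
every reader and writer share the same blocking behavior.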

** unit
A collection or #f, read-only.
This property indicates the transmission unit. Value #f means that the
property is not applicable (e.g., the duct doesn't do any i/o: the
"null" duct) or the unit is of a variable size. If the value is not #f,
it is a collection of the appropriate type (and of the cardinality of
at least 1 if the collection is a general vector). For example, if the
unit of transfer is a byte, the value of the property 'unit' is a
u8vector. If the unit of transmission is a character, the property
'unit' is a string.

** available-read
Integer or #f, read-only.
The number of units that can be read from the duct immediately without
blocking. It may be a precise count or an (under) estimate. For
example, if at least one unit of data is immediately available, the
property may be set to 1. Value 0 means no unit of data is
_immediately_ available. Value #f means that information is not
available (e.g., because the duct is write-only or because it is
closed).

We should emphasize that in _general_ we cannot know if the next read
operation will block. The value of the available-read property will, in
this general case, be #f. In some particular cases, however, the duct
is positive that it can give at least n units without blocking: for
example, the duct already has these units in its buffer. In that case,
the duct can set the available-read property to n -- or some other
number not exceeding n. The properties are dynamic -- available-read
will almost certainly change after a read operation.

Only ducts that can be read from have this property.

** out-waiting
Integer or #f, read-only.
Only ducts that can be written to have this property.
??? TBD

** parent-duct
An OS:duct or #f.
A lower-level duct underneath the current one.

** sync, sync-recursively
Boolean, write-only (the value to set is disregarded).
Setting a 'sync' property "synchronizes" the duct to its underlying
duct or to the OS.
In other words, this operation commits the outstanding data to a lower
duct or disregards read-ahead data. Cf. sync() of C++ iostream. Setting
the property sync-recursively invokes sync-recursively on the
underlying duct as well. After sync-recursively, the data are fully
committed. sync-recursively on a file duct does fsync() to guarantee
that all outstanding writes are committed.

** close?
Boolean.
Setting it to #t closes the duct.

** read-fn, write-fn
The reader and writer procedures. See OS:duct-read and OS:duct-write
below. Only a duct that can be read has the read-fn property. Only a
duct that can be written to possesses the write-fn property.

** close-fn
A procedure of one argument -- an OS:duct -- invoked when the duct is
about to be closed. The property closed? is set when the procedure
returns.
??? perhaps that should be called close-notify-fn.
So far, we didn't have any need for it.

** sync-fn
A procedure of one argument -- an OS:duct -- invoked to sync (aka
flush) the duct.
??? perhaps that should be called sync-notify-fn.
So far, we didn't have any need for it.

** buffer-size
Integer or #f.
The size of the current buffer, in i/o units. This property is mutually
exclusive with the 'buffer' property below.

** buffer
Vector, string, uniform vector, or #f.
The current i/o buffer.

** read-pos
Integer or #f. May be a big integer.
An integer that represents the current position in the duct, in
duct-specific terms. #f if not applicable (if the duct is not
seekable). Note that the result of read-pos is generally an opaque
integer.
Only ducts that can be read have this property.

** write-pos
Integer or #f. May be a big integer.
An integer that represents the current position in the duct, in
duct-specific units. #f if not applicable (if the duct is not seekable
for writing). A circular buffer duct may legitimately have distinct
read-pos and write-pos.
Only ducts that can be written to have this property.

** ignore
Integer, write-only.
Skip the specified number of units.
Note that the result of reading the read-pos property does not,
strictly speaking, give the number of units read. We can disregard a
certain number of units from ducts that are not seekable. Return the
number of units ignored, or #f if not successful or not implemented.

* Reading and writing

Reading and writing are performed by invoking the reader and writer
procedures, which are stored in the properties read-fn and write-fn.
The procedures have the following signatures:

    read-proc  DUCT VECTOR START END QUALIFIERS ... -> integer read
    write-proc DUCT VECTOR START END QUALIFIERS ... -> integer written

Here DUCT is the current duct. VECTOR is a uniform vector, an ordinary
vector or a string. START and END represent the range to be read or
written to:

    0 <= START <= END <= (vector-length VECTOR)

The procedures return the number of units that have actually been
transferred. The number of units is no more than (- END START). Errors
(and sometimes EOF, if requested) raise an exception.

Do we need qualifiers?

* Derived functions

This proposal defines a few convenience functions, implemented in terms
of the five basic functions. Some of the convenience functions are
defined in R5RS.

    ;; Read into buffer[start, end) via the duct's read-fn property.
    (define (OS:duct-read duct buffer start end . others)
      (apply (OS:duct-property-get duct 'read-fn)
             duct buffer start end others))

    ;; Write buffer[start, end) via the duct's write-fn property.
    (define (OS:duct-write duct buffer start end . others)
      (apply (OS:duct-property-get duct 'write-fn)
             duct buffer start end others))

    ;; A duct is an input duct if it has a read-fn property. Looking up
    ;; a missing property is assumed to raise an exception, which is
    ;; turned into #f here.
    (define (OS:input-duct? duct)
      (and (OS:duct? duct)
           (handle-exceptions exn #f
             (OS:duct-property-get duct 'read-fn))))

    ;; Likewise for output ducts and the write-fn property.
    (define (OS:output-duct? duct)
      (and (OS:duct? duct)
           (handle-exceptions exn #f
             (OS:duct-property-get duct 'write-fn))))

    ;; The value assigned to 'sync is disregarded.
    (define (OS:duct-flush duct)
      (OS:duct-property-set! duct 'sync #t))

    (define (OS:duct-close duct)
      (OS:duct-property-set! duct 'close? #t))

Also define: read-char write-char peek-char char-ready?
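
As a sketch of how read-char and peek-char might be derived, the
helper below builds both from any reader procedure that follows the
read-proc signature given above (for instance OS:duct-read). It is an
illustration under stated assumptions, not part of the proposal: the
name make-char-reader is hypothetical; the duct is assumed to transfer
characters into a string; its eof-handler is assumed to be #f, so that
EOF shows up as a count of 0 rather than as an exception; and the
symbol 'eof stands in for a real end-of-file object.

```scheme
;; Hypothetical helper. duct-read is a procedure with the signature
;;   (duct-read DUCT STRING START END) -> number of characters read
;; Returns a pair (read-char-proc . peek-char-proc), both thunks.
(define (make-char-reader duct-read duct)
  (let ((lookahead #f))           ; a peeked character, 'eof, or #f
    ;; Read exactly one character from the duct, or return 'eof.
    (define (fetch)
      (let* ((buffer (make-string 1))
             (count  (duct-read duct buffer 0 1)))
        (if (= count 1) (string-ref buffer 0) 'eof)))
    ;; Return the next character without consuming it.
    (define (peek)
      (if (not lookahead) (set! lookahead (fetch)))
      lookahead)
    ;; Return the next character and consume it. Once 'eof has been
    ;; seen it is kept, so later calls keep returning 'eof.
    (define (next)
      (let ((c (peek)))
        (if (not (eq? c 'eof)) (set! lookahead #f))
        c))
    (cons next peek)))
```

The single-character lookahead lives in the closure here; a duct-based
version could equally keep it in a property of a higher-level duct.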

------------------------------------------------------------------------

* Predefined ducts

** null

** fixed input buffer

Properties:

-- parent-duct: the duct to use to fill the buffer from. The duct
   should be open and we should be able to read from it.

-- buffer: the buffer to use. The buffer should be consistent with the
   units of transfer of the parent duct. That is, if the parent duct
   reads characters, buffer might be a string (but not a u8vector).

-- available-read: optional property. If it is set and not 0, it tells
   that the given input buffer is already filled with data to some
   extent. Reading from the buffer duct will consume this data first,
   and then will use the backing duct to fill the buffer.

-- read-pos: optional. If specified and greater than 0, it limits the
   portion of the buffer with the useful data to be from read-pos to
   read-pos + available-read (exclusive).

-- buffer-size: this property is incompatible with the buffer,
   available-read and read-pos properties above. If buffer-size is
   given, we ask the duct to allocate its own buffer of the given size.
   The type of the buffer will be consistent with the type of the units
   of the parent duct.

The input-buffer: layer can support buffered file, network, etc.
operations. Furthermore, overlaying an input-buffer: duct over a null:
duct and passing it a buffer with useful data gives us a memory-backed
duct similar to an input string port of SRFI-6.
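
The read logic described by these properties can be sketched as
follows. This is a toy under stated assumptions, not the reference
implementation: the name make-buffered-reader is hypothetical; the
parent duct is represented only by a procedure parent-read; the buffer
and the target are strings; and the arguments read-pos and available
model the initial values of the read-pos and available-read properties.

```scheme
;; Hypothetical sketch of the read logic of an input-buffer layer.
;;   parent-read -- procedure (buffer start end) -> count, standing in
;;                  for the parent duct's reader
;;   buffer      -- the string used as the buffer
;;   read-pos, available -- the buffer already holds useful data at
;;                  indices read-pos .. read-pos + available - 1
;; Returns a procedure (target start end) -> number of units read.
(define (make-buffered-reader parent-read buffer read-pos available)
  (lambda (target start end)
    (let loop ((i start))
      (cond
       ;; The request is satisfied.
       ((>= i end) (- i start))
       ;; Serve the data already in the buffer first.
       ((> available 0)
        (string-set! target i (string-ref buffer read-pos))
        (set! read-pos (+ read-pos 1))
        (set! available (- available 1))
        (loop (+ i 1)))
       ;; The buffer is empty: refill it from the parent.
       (else
        (set! read-pos 0)
        (set! available
              (parent-read buffer 0 (string-length buffer)))
        (if (> available 0)
            (loop i)
            (- i start)))))))   ; the parent has no more data

;; Usage: a parent that never delivers data plays the role of "null:",
;; which gives a memory-backed reader over the pre-filled buffer.
(define mem-read
  (make-buffered-reader (lambda (buffer start end) 0)
                        (string-copy "xhello") 1 5))
(define out (make-string 3))
(mem-read out 0 3)   ; => 3, and out now holds "hel"
```

With a parent that does deliver data, the same reader refills the
buffer whenever it runs dry, which is the buffered file or network
case.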

** fixed buffer

** variable buffer

** file

    O_NONBLOCK   do not block on open
    O_APPEND     append on each write
    O_CREAT      create file if it does not exist
    O_TRUNC      truncate size to 0
    O_EXCL       error if create and file exists
    O_SHLOCK     atomically obtain a shared lock
    O_EXLOCK     atomically obtain an exclusive lock
    O_DIRECT     eliminate or reduce cache effects
    O_FSYNC      synchronous writes
    O_NOFOLLOW   do not follow symlinks

    'create      Boolean, #f. Create if not exists.
    append       fcntl/O_APPEND and open/O_APPEND

Exceptions: see the i/o condition SRFI and the open(2) man page.

Property clone: do fcntl/F_DUPFD and clone the duct over the new descr.

close-on-exec: fcntl/F_GETFD and F_SETFD

property sigio-own
Get the process ID or process group currently receiving SIGIO and
SIGURG signals; process groups are returned as negative values.
fcntl/F_GETOWN/F_SETOWN

async property should be assignable to process id, flag.

append, direct, async: settable

advisory-lock: procedure or #f

fileno (native handle: use with care!)

getting the stat of an open file.

** pipe

** clientsocket:

property 'shutdown

** The names of other URL schemas, to be defined in other SRFIs

ftp, http, ...

* Examples and use cases

** implement string ducts

** reading from an ordinary file (blocking)

** appending to a file (no contention)

** Random writing to a file (with advisory locking and timeout)

** Non-blocking reading from a pipe

** Reading from a file in a multi-threaded program

** reading from an HTTP stream

Example of a fixed-buffer overlay

** other examples

reading from a mmap-ed file, sockets

* Sources and inspiration

Gambit-C Scheme system's ducts.
The C++ stream library (old edition)
Netgraph
Plan9
Tcl channels?
BIO of OpenSSL
SNMP/MIB
    SNMP is specified as the framework document and a collection of
    RFCs for various particular MIBs.

------------------------------------------------------------------------

* References

[BIO] BIO -- basic i/o abstraction of OpenSSL
http://www.openssl.org/docs/crypto/bio.html

[J2ME] Reference??

[NetGraph]
http://www.FreeBSD.org/cgi/man.cgi?query=netgraph&sektion=4
As you see, it's a very flexible architecture. People have implemented
various kinds of protocols (such as PPPoE, MAC-address filtering,
etc.). But Netgraph is a kernel thing. I want to have something like
that, but in the userland. It seems most of it can be easily
implemented, with sockets and file handles as the backend.

[Plan9] The best places are
http://www.vitanuova.com/inferno/index.html
http://www.vitanuova.com/plan9/main.html

BTW, Plan9 documentation (specifically, man pages for 9P)
http://plan9.bell-labs.com/magic/man2html/5/0intro
suggests how we can "fold" various directory functions into our
file-based framework. However, perhaps the topic of directories should
be explored at a later time.

[SNMP] Reference??