read-byteor a similar function. However, almost all Scheme systems either provide such a function or let us emulate one, as a composition of
read-char. All the code in this section assumes such a native or emulated
Binary parsing and unparsing are transformations between primitive or composite Scheme values and their external binary representations. Examples include reading and writing JPEG, TIFF, ELF file formats, communicating with DNS, Kerberos, LDAP, SLP internet services, participating in Sun RPC and CORBA/IIOP distributed systems, storing and retrieving (arrays of) floating-point numbers in a portable and efficient way.
This short position talk proposes a set of low- and intermediate- level procedures that make binary parsing possible.
The bit reader is the first part of a binary parsing framework. The bit reader code was intended to be as general and optimal as possible. The timing study given in the article referenced below shows the extent both goals have been met.
The bit reader lets us read one bit from an arbitrary stream. We can then read one more bit -- or two more bits, or 3, 7, 8, 23, 30, or 32 more bits. And then 33 bits, 65535, 81920,... bits. Following the tradition of Scheme, the bit reader does not impose any artificial upper limit whatsoever on the number of bits we can attempt to read from a stream. We are naturally bounded by the size of the stream and the amount of the virtual space on the system. The validation tests included with the source code really read all the bits from a file in one swoop -- as well as one bit at a time, and many cases in-between.
The bit reader intentionally does not read ahead: no byte is
read until the very moment we really need (some of) its bits.
Therefore, the reader can be used to handle a concatenation of
different bit/byte streams strictly sequentially, without
'backing up a char', 'unreading-char' etc. tricks. For example,
make-bit-reader has been used to read GRIB files of
meteorological data, which are made of several bitstreams with headers and
tags. Careful attention to byte-buffering and optimization are
the features of this bit reader.
The code is tested on Gambit-C 3.0 and MIT Scheme 5d2. The code has R5RS versions of needed logical primitives, so it should work for any R5RS Scheme system. Despite the bit reader being so general, it is nevertheless optimized for common uses. The optimized cases stand out in the timing benchmarks discussed in the article.
Insightful discussions with Daniel Ortmann are gratefully acknowledged.
A commented source code, validation tests, and a timing benchmark.
The bit reader in the Scheme Untergrund Library
kindly ported by Martin Gasbichler.
A USENET article [plain text file]
that describes the code and presents the results of several performance benchmarks. The timings also give an insight into the performance of a Scheme system. The article was posted as _wide-range_ optimized bit reader and its performance on a newsgroup
comp.lang.scheme on Sun, 22 Oct 2000 20:20:56 GMT.
An Endian I/O port lets us read or write integers of various sizes taking a byte order into account.
The TIFF library, for example, assumes the existence of a data
EPORT with the following operations:
endian-port-set-bigendian! EPORT -> UNSPECIFIED endian-port-set-littlendian! EPORT -> UNSPECIFIED endian-port-read-int1 EPORT -> UINTEGER (byte) endian-port-read-int2 EPORT -> UINTEGER endian-port-read-int4 EPORT -> UINTEGER endian-port-setpos EPORT INTEGER -> UNSPECIFIED
The endian port can be implemented in a R5RS Scheme system if
we assume that the composition of
read-char yields a byte and if we read the whole file into
a string or a u8vector (SRFI-4). Obviously, there are times when such
a solution is not satisfactory. Therefore, tiff-prober and the
validation code vtiff.scm rely on a Gambit-specific code. All major
Scheme systems can implement endian ports in a similar vein -- alas,
each in its own particular way.
This TIFF prober code provides an implementation of the input Endian port specifically tuned for Gambit.
Advanced i/o and arithmetic compression C++ class library
Our goal is to reproduce all the functionality of that C++ library in Scheme.
TIFF library is a Scheme library to read and analyze TIFF image
files. We can use the library to obtain the dimensions of a TIFF
image; the image name and description; the resolution and other
meta-data. We can then load a pixel matrix or a colormap table. An
tiff-prober program prints out the TIFF
dictionary in a raw and polished formats.
tagdict-get-by-name TAGDICT TAG-NAME -> INT
tagdict-get-by-num TAGDICT INT -> TAG-NAME or #f
tagdict-tagval-get-by-name TAGDICT TAG-NAME VAL-NAME -> INT
tagdict-tagval-get-by-num TAGDICT TAG-NAME INT -> VAL-NAME or #f
make-tagdict ((TAG-NAME INT (VAL-NAME . INT) ...) ...) -> TAGDICT
tagdict? TAGDICT -> BOOL
tagdict-add-all DEST-DICT SRC-DICT -> DEST-DICT
read-tiff-file EPORT [PRIVATE-TAGDICT] -> TIFF-DIR
print-tiff-directory TIFF-DIR OPORT -> UNSPECIFIED
tiff-directory? SCHEME-VALUE -> BOOL
tiff-directory-size TIFF-DIR -> INT
tiff-directory-empty? TIFF-DIR -> BOOL
tiff-directory-fold-left TIFF-DIR FN SEED ... -> [SEED ...]
tiff-directory-get TIFF-DIR KEY [ABSENCE-THUNK] -> VALUE
KEYmay be either a symbol or an integer
tiff-directory-get-as-symbol TIFF-DIR KEY [ABSENCE-THUNK] -> VALUE
The library accesses the input TIFF image file solely through the methods defined for the endian port.
The announcement of the library, with examples and details [plain text file]
The article was posted as [ANN] Reading TIFF files on a newsgroup
comp.lang.scheme on Tue, 7 Oct 2003 18:24:12 -0700.
The commented source code. It explains the interface above in far more detail. Dependencies: util.scm, char-encoding.scm, myenv.scm.
The validation code.The validation code includes a function
test-reading-pixel-matrix that demonstrates loading a pixel
matrix of an image in an u8vector. The code can handle a single or
A sample TIFF file for the validation code.It is the image of the GNU head (www.gnu.org) converted from JPEG to TIFF by xv. Copyleft by GNU.
A TIFF prober program: a sample application of the TIFF library. The prober prints out the contents of a TIFF dictionary of input TIFF files.
Iteratee-based TIFF library
We show how to read IEEE binary floating-point
numbers using only procedures defined in R5RS. No special language
extensions, foreign function interfaces, or libraries are required.
The only assumption is that
returns an integer with the same bit pattern as the function's
argument, a single 8-bit ASCII character. The assumption holds for
many Scheme systems.
The code can read 4-byte single-precision IEEE floating-point
The code does not handle
NaNs, although this is trivial to add, as explained in
the comments to the code.
One can twiddle bits in Scheme after all: it is just arithmetics...
comp.lang.schemeon Wed, 08 Mar 2000 03:24:25 GMT.
This site's top page is http://okmij.org/ftp/
Converted from SXML by SXML->HTML