From posting-system@google.com Wed Oct  8 01:24:13 2003
Date: Tue, 7 Oct 2003 18:24:12 -0700
From: oleg@pobox.com (oleg@pobox.com)
Newsgroups: comp.lang.scheme
Subject: [ANN] Reading TIFF files
Message-ID: <7eb8ac3e.0310071724.59bffe62@posting.google.com>
Status: OR

This is to announce a Scheme library to read and analyze TIFF image
files. We can use the library to obtain the dimensions of a TIFF
image; the image name and description; the resolution and other
meta-data. We can then load a pixel matrix or a colormap table. An
accompanying tiff-prober program prints out the TIFF dictionary in
raw and polished formats.

    http://pobox.com/~oleg/ftp/Scheme/lib/tiff.scm
        dependencies: util.scm, char-encoding.scm, myenv.scm
    http://pobox.com/~oleg/ftp/Scheme/tests/vtiff.scm
        see also: gnu-head-sm.tif in the same directory
    http://pobox.com/~oleg/ftp/Scheme/tiff-prober.scm

Features:
  - The library handles TIFF files written in both endian formats.
  - A TIFF directory is treated somewhat as a SRFI-44 immutable
    dictionary collection. Only the most basic SRFI-44 methods are
    implemented, including the left fold iterator and the get method.
  - An extensible tag dictionary translates between symbolic tag
    names and numeric ones. Ditto for tag values.
  - A tag dictionary for all TIFF 6 standard tags and values comes
    with the library. A user can add the definitions of his private
    tags.
  - The library handles TIFF directory values of types:
    (signed/unsigned) byte, short, long, rational; ASCII strings.
  - Particular care is taken to properly handle values whose total
    size is no more than 4 bytes.
  - Array values (including the image matrix) are returned as
    uniform vectors (SRFI-4).
  - Values are read lazily. If you are only interested in the
    dimensions of an image, the image matrix itself will not be
    loaded.

Here's the result of running tiff-prober on the image of the GNU head
(converted from JPEG to TIFF by xv). I hope I won't have any copyright
problems with using and distributing that image.

Analyzing TIFF file tests/gnu-head-sm.tif...
There are 15 entries in the TIFF directory
they are
TIFFTAG:IMAGEWIDTH, count 1, type short, value-offset 129 (0x81)
TIFFTAG:IMAGELENGTH, count 1, type short, value-offset 122 (0x7A)
TIFFTAG:BITSPERSAMPLE, count 1, type short, value-offset 8 (0x8)
TIFFTAG:COMPRESSION, count 1, type short, value-offset 1 (0x1)
TIFFTAG:PHOTOMETRIC, count 1, type short, value-offset 1 (0x1)
TIFFTAG:IMAGEDESCRIPTION, count 29, type ascii str, value-offset 15932 (0x3E3C)
TIFFTAG:STRIPOFFSETS, count 1, type long, value-offset 8 (0x8)
TIFFTAG:ORIENTATION, count 1, type short, value-offset 1 (0x1)
TIFFTAG:SAMPLESPERPIXEL, count 1, type short, value-offset 1 (0x1)
TIFFTAG:ROWSPERSTRIP, count 1, type short, value-offset 122 (0x7A)
TIFFTAG:STRIPBYTECOUNTS, count 1, type long, value-offset 15738 (0x3D7A)
TIFFTAG:XRESOLUTION, count 1, type rational, value-offset 15962 (0x3E5A)
TIFFTAG:YRESOLUTION, count 1, type rational, value-offset 15970 (0x3E62)
TIFFTAG:PLANARCONFIG, count 1, type short, value-offset 1 (0x1)
TIFFTAG:RESOLUTIONUNIT, count 1, type short, value-offset 2 (0x2)

image width: 129
image height: 122
image depth: 8
document name: *NOT SPECIFIED*
image description:
   JPEG:gnu-head-sm.jpg 129x122
time stamp: *NOT SPECIFIED*
compression: NONE

In particular, the dump of the tiff directory is produced by the
following line of code

    (print-tiff-directory tiff-dict (current-output-port))

To determine the width of the image, we do

    (tiff-directory-get tiff-dict 'TIFFTAG:IMAGEWIDTH not-spec)

To determine the compression (as a symbol) we evaluate

    (tiff-directory-get-as-symbol tiff-dict 'TIFFTAG:COMPRESSION not-spec)

If an image directory contains private tags, they will be printed like
the following:

private tag 33009, count 1, type signed long, value-offset 16500000 (0xFBC520)
private tag 33010, count 1, type signed long, value-offset 4294467296 (0xFFF85EE0)

A user may supply a dictionary of his private tags and enjoy the
automatic translation from
symbolic to numerical tag names.

The validation code vtiff.scm includes a function
test-reading-pixel-matrix that demonstrates loading a pixel matrix of
an image in a u8vector. The code can handle a single or multiple
strips.

Portability: the library itself, tiff.scm, relies on the following
extensions to R5RS: uniform vectors (SRFI-4); ascii->char function
(which is on many systems just integer->char); trivial define-macro
(which can be easily re-written into syntax-rules); let*-values
(SRFI-11); records (SRFI-9). Actually, the code uses Gambit's native
define-structures, which can be easily re-written into SRFI-9
records. The Scheme system should be able to represent the full range
of 32-bit integers and should support rationals.

The most problematic extension is an endian port. The TIFF library
assumes the existence of a data structure with the following
operations:

    endian-port-set-bigendian!::   EPORT -> UNSPECIFIED
    endian-port-set-littlendian!:: EPORT -> UNSPECIFIED
    endian-port-read-int1::        EPORT -> UINTEGER (byte)
    endian-port-read-int2::        EPORT -> UINTEGER
    endian-port-read-int4::        EPORT -> UINTEGER
    endian-port-setpos::           EPORT INTEGER -> UNSPECIFIED

The library uses solely these methods to access the input port. The
endian port can be implemented in an R5RS Scheme system if we assume
that the composition of char->integer and read-char yields a byte and
if we read the whole file into a string or a u8vector (SRFI-4).
Obviously, there are times when such a solution is not satisfactory.
Therefore, tiff-prober and the validation code vtiff.scm rely on
Gambit-specific code. All major Scheme systems can implement endian
ports in a similar vein -- alas, each in its own particular way.
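
To make the R5RS route concrete, here is a minimal sketch of an endian
port backed by a string that already holds the file's bytes. It is not
the code that ships with tiff.scm or tiff-prober. It assumes that
char->integer applied to each character of the string yields a byte in
the range 0..255, and that integer->char accepts all such values. The
constructor make-string-endian-port and the helpers
endian-port-read-bytes and bytes->string are hypothetical names
introduced only for this illustration; the six endian-port-* operations
follow the signatures listed above.

```scheme
;; Sketch only: an endian port over an in-memory string of bytes.
;; Assumption: (char->integer (string-ref str i)) is a byte in 0..255.
;; The port is a 3-element vector: #(bytes position big-endian?).

(define (make-string-endian-port str)      ; hypothetical constructor
  (vector str 0 #t))

(define (endian-port-set-bigendian! eport)
  (vector-set! eport 2 #t))

(define (endian-port-set-littlendian! eport)
  (vector-set! eport 2 #f))

;; Move the read position to an absolute byte offset.
(define (endian-port-setpos eport pos)
  (vector-set! eport 1 pos))

;; Read one byte and advance the position.
(define (endian-port-read-int1 eport)
  (let* ((str  (vector-ref eport 0))
         (pos  (vector-ref eport 1))
         (byte (char->integer (string-ref str pos))))
    (vector-set! eport 1 (+ pos 1))
    byte))

;; Hypothetical helper: read n bytes and combine them into an unsigned
;; integer according to the port's current byte order.
(define (endian-port-read-bytes eport n)
  (let loop ((i 0) (acc 0) (scale 1))
    (if (= i n)
        acc
        (let ((byte (endian-port-read-int1 eport)))
          (if (vector-ref eport 2)
              ;; big-endian: earlier bytes are more significant
              (loop (+ i 1) (+ (* acc 256) byte) scale)
              ;; little-endian: later bytes are more significant
              (loop (+ i 1) (+ acc (* byte scale)) (* scale 256)))))))

(define (endian-port-read-int2 eport) (endian-port-read-bytes eport 2))
(define (endian-port-read-int4 eport) (endian-port-read-bytes eport 4))

;; Hypothetical helper for building test data from byte values.
(define (bytes->string . bytes)
  (list->string (map integer->char bytes)))
```

For example, a little-endian TIFF header starts with the bytes "II",
the 2-byte magic number 42, and the 4-byte offset of the first
directory:

```scheme
(define hdr (make-string-endian-port (bytes->string 73 73 42 0 8 0 0 0)))
(endian-port-setpos hdr 2)          ; skip the "II" byte-order mark
(endian-port-set-littlendian! hdr)
(endian-port-read-int2 hdr)         ; => 42
(endian-port-read-int4 hdr)         ; => 8
```

Reading the whole file into memory keeps the sketch portable, at the
cost the post already notes: it is not always a satisfactory solution
for large images.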