Patch-free User-level Link-time intercepting of system calls and interposing on library functions

Contents

  1. Introduction
  2. Statement of the Problem
  3. Solutions
    1. Linux 2.x and GNU ld
    2. HP-UX 10.x and a native ld
    3. Solaris 2.6 and a native ld
    4. FreeBSD 3.2 and a GNU ld
  4. Application: Extended File Names and Virtual File Systems
 

This is an explanation of a technique to effectively "substitute" ("rename", "interpose on") the open(2) libc function, or other similar functions and system calls. As the title says, the technique is patch-free: neither the kernel, nor libc, nor any other user or system file or library is patched or modified.

Yet all of an application's attempts to open files are routed to my version of open(), which may call the regular open() in some cases.

This technique has been tested under GNU/Linux 2.0.27+ and 2.2.10+ with gcc 2.7.2.3 and egcs 2.91.66 (Slackware and S.u.S.E. distributions). By the same token, the trick should work with GNU ld on any other platform. The technique also works on Sun SPARC/Solaris 2.6 with Sun's native ld, and on an HP 9000/770 running HP-UX B.10.10 with HP's native ld (HP-PA does not permit GNU ld). The method should work on other UNIX platforms as well, although the precise set of ld flags needed to accomplish the trick may vary from one UNIX/ld vendor to another.
 

Statement of the Problem

Suppose a file mc_open.o implements a function mc_open(), which has the same interface as the standard POSIX function open(2). Function mc_open() may, for example, examine the name of the file to open and opening modes, consult "access control lists", and eventually invoke open(2), passing the same or altered or substituted file name and opening modes.
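
To make the description concrete, here is a minimal sketch of what such an mc_open() might look like. The policy it enforces (refusing to open anything under a fixed prefix for writing) and the name read_only_prefix are assumptions made purely for illustration; the real mc_open() can apply any checks it likes before passing the call on.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical policy: names under this prefix may not be opened for writing. */
static const char read_only_prefix[] = "/etc/";

/* Same interface as open(2): returns a file descriptor, or -1 with errno set. */
int mc_open(const char *filename, const int mode, const int mask)
{
  const int wants_write = (mode & O_ACCMODE) != O_RDONLY;

  /* Stand-in for consulting an "access control list". */
  if (wants_write &&
      strncmp(filename, read_only_prefix, sizeof(read_only_prefix) - 1) == 0) {
    errno = EACCES;
    return -1;
  }

  /* Pass the (possibly altered) arguments on to the regular open(). */
  return open(filename, mode, mask);
}
```

In this stand-alone form the inner call goes straight to libc's open(). Once the object is linked as described in the solutions, it is the linker's job to make that inner reference reach the true open(2) rather than loop back into the interposed one.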

Suppose an object file vendian_io.o invokes open(2), either directly or indirectly: for example, vendian_io.o may use fopen() or C++ fstreams.

We want to link vendian_io.o and mc_open.o in such a way that every time vendian_io.o calls open(2) -- either directly or indirectly -- the call is forwarded to mc_open(). mc_open() may then examine its arguments and eventually invoke the true open(2) itself.
 

Solution 1: using Linux 2.x and GNU ld

Note that the source of neither vendian_io.o nor mc_open.o is needed: we deal solely with already compiled object code.

Step 1

Create a file mc_open_glue.c:

/*
 ************************************************************
 * This is glue code that "renames" our mc_open() into open(),
 * so that our mc_open() takes the place of (interposes on)
 * the ordinary open(2) system call.
 ************************************************************/

#if defined(DEBUG)
#include <stdio.h>      /* printf() and perror() for the DEBUG branch */
#endif

extern int mc_open(const char *filename, const int mode, const int mask);
int open();     /* Suppress errors, as open(2) is declared with varargs
                   on GNU/Linux and FreeBSD */

int open(const char *filename, const int mode, const int mask)
{
#if !defined(DEBUG)
  return mc_open(filename,mode,mask);
#else
  int result;  printf("\nOpening '%s'...",filename);
  result = mc_open(filename,mode,mask); printf("\nresult... %d",result);
  if(result < 0 )
     perror("opening error");
  return result;
#endif
}
Unless DEBUG is defined, this glue function merely forwards the call to mc_open().

Step 2

Compile mc_open_glue.c above to obtain mc_open_glue.o.

Step 3

Make an object file open_ext.o as follows:

  ar xv /usr/lib/libc.a open.o sysdep.o
  ld -r -x -wrap open -defsym __wrap_open=__libc_open \
  -defsym __open=mc_open \
  mc_open.o open.o sysdep.o -o open_ext.o

Step 4

Link vendian_io.o with mc_open_glue.o and open_ext.o made in the above two steps. This gives us the final executable:

  gcc vendian_io.o mc_open_glue.o open_ext.o -o vendian_io -lm
Note that libc is linked dynamically, as it usually is by default; mc_open.o may be linked in dynamically as well.
 

Solution 2: using HP-UX 10.x and a native ld

The only difference from Solution 1 above is Step 3, which now should read:

Step 3

Make an object file open_ext.o as follows:

  ar xv /usr/lib/libc.a t_open.o
  ld -r -h open -v -B immediate mc_open.o t_open.o -o open_ext.o
The rest of the procedure is identical.

 
 

Solution 3: using Solaris 2.6 and a native ld

The only difference from Solution 1 above is Step 3, which now should read:

Step 3

Make an object file open_ext.o as follows:

  ar xv /usr/lib/libc.a libc_open.o open.o open64.o
  ld -r -B local -z redlocsym mc_open.o libc_open.o open.o open64.o -o open_ext.o
The rest of the procedure is identical.

 
 

Solution 4: FreeBSD 3.2 and a GNU ld

Steps 1 and 3 from Solution 1 above have to be altered as follows:

Step 1

Create a file mc_open_glue.c:

/*
 ************************************************************
 * This is glue code that "renames" our mc_open() into open(),
 * so that our mc_open() takes the place of (interposes on)
 * the ordinary open(2) system call.
 ************************************************************/

#if defined(DEBUG)
#include <stdio.h>      /* printf() and perror() for the DEBUG branch */
#endif

extern int mc_open(const char *filename, const int mode, const int mask);
int open();     /* Suppress errors, as open(2) is declared with varargs
                   on GNU/Linux and FreeBSD */

#if defined(__FreeBSD__)
/* Redirecting a wrapped open() in mc_open() to _open() which
   actually traps into the kernel. */
int __wrap_open(const char *filename, const int mode, const int mask)
{ return _open(filename, mode, mask); }
#endif

int open(const char *filename, const int mode, const int mask)
{
#if !defined(DEBUG)
  return mc_open(filename,mode,mask);
#else
  int result;  printf("\nOpening '%s'...",filename);
  result = mc_open(filename,mode,mask); printf("\nresult... %d",result);
  if(result < 0 )
     perror("opening error");
  return result;
#endif
}

Step 3

Make an object file open_ext.o according to the following:

  ld -r -x -wrap open mc_open.o -o open_ext.o
The other steps are the same.

 
 

Application: Extended File Names and Virtual File Systems

Extended file names are the ones that may have "pipes" in them, for example,
   "gunzip < /tmp/aa.gz |"
   " | gzip -best > file.gz " or even
   " cat file-name | tee transcript |"
Another example of an extended file name is "tcp://localhost:13". These are the names of "files", and as such, can be passed to open(), fopen(), fstream(), with-input-from-file, etc.
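
The kinds of names listed above can be told apart by looking at the ends of the string. The following is only a sketch of such a classification, not the code of the actual implementation: the enum, the function name classify_ext_name() and the rule that a bar at both ends means a bidirectional pipe are all assumptions of this example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of (extended) file names. */
enum ext_name_kind {
  EXT_PLAIN,          /* an ordinary file name               */
  EXT_PIPE_FROM_CMD,  /* "cmd |"  : read the command's output */
  EXT_PIPE_TO_CMD,    /* "| cmd"  : write to the command      */
  EXT_PIPE_BOTH,      /* "| cmd |": assumed bidirectional     */
  EXT_TCP             /* "tcp://host:port"                    */
};

enum ext_name_kind classify_ext_name(const char *name)
{
  const char *first = name;
  const char *last;
  int leading_bar, trailing_bar;

  if (strncmp(name, "tcp://", 6) == 0)
    return EXT_TCP;

  /* Skip surrounding blanks, as in " | gzip -best > file.gz ". */
  while (*first != '\0' && isspace((unsigned char)*first))
    first++;
  if (*first == '\0')
    return EXT_PLAIN;
  last = first + strlen(first) - 1;
  while (last > first && isspace((unsigned char)*last))
    last--;

  leading_bar  = (*first == '|');
  trailing_bar = (*last == '|');
  if (leading_bar && trailing_bar && last > first)
    return EXT_PIPE_BOTH;
  if (leading_bar)
    return EXT_PIPE_TO_CMD;
  if (trailing_bar)
    return EXT_PIPE_FROM_CMD;
  return EXT_PLAIN;
}
```

Note that a bar in the middle of a name, as in " cat file-name | tee transcript |", does not matter: only the first and the last non-blank characters are examined.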

This file name extension is implemented at the lowest possible level, right before a request to open a file is passed to the kernel by the open(2) system call. A function sys_open() (in the source file sys_open.c) acts as a "patch": that is, if you call sys_open() instead of open() to open a file, you get all the open() functionality plus the extended file names. Note that neither the kernel, nor libc, nor any user or system file or library is actually patched or modified in any way.

The function sys_open() is a preprocessor to open(2) that can handle extended file names like "cmd |", "| cmd", or "tcp://hostname:port" where cmd is anything that can be passed to /bin/sh. The shell /bin/sh is launched in a subprocess to interpret the cmd; the shell's stdin, stdout or both become the file that is "opened" by this function. In all other respects sys_open() is equivalent to open(2).
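
The pipe handling just described can be sketched as follows. This is an illustration under stated assumptions, not the code of sys_open.c: the names sketch_sys_open() and pipe_to_shell() are hypothetical, only the one-directional "cmd |" and "| cmd" forms are handled, the command length is capped at an arbitrary 1023 characters, and the child shell is never reaped.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: start "/bin/sh -c cmd" with one end of a pipe as its
   stdin (child_reads != 0) or its stdout (child_reads == 0); return the
   other end to the caller, or -1 on failure. The child is not reaped here. */
static int pipe_to_shell(const char *cmd, int child_reads)
{
  int fds[2];            /* fds[0]: read end, fds[1]: write end */
  int child_end, parent_end;
  pid_t pid;

  if (pipe(fds) < 0)
    return -1;
  child_end  = child_reads ? fds[0] : fds[1];
  parent_end = child_reads ? fds[1] : fds[0];

  pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  if (pid == 0) {        /* child: become the shell */
    dup2(child_end, child_reads ? STDIN_FILENO : STDOUT_FILENO);
    close(fds[0]);
    close(fds[1]);
    execl("/bin/sh", "sh", "-c", cmd, (char *)0);
    _exit(127);          /* exec failed */
  }
  close(child_end);      /* parent keeps only its own end */
  return parent_end;
}

/* Same interface as open(2); understands "cmd |" and "| cmd" only. */
int sketch_sys_open(const char *filename, const int mode, const int mask)
{
  char cmd[1024];
  const char *p = filename;
  size_t len;

  while (*p == ' ')                      /* ignore surrounding blanks */
    p++;
  len = strlen(p);
  while (len > 0 && p[len - 1] == ' ')
    len--;

  if (len > 1 && len < sizeof cmd && p[0] == '|') {        /* "| cmd" */
    memcpy(cmd, p + 1, len - 1);
    cmd[len - 1] = '\0';
    return pipe_to_shell(cmd, 1);
  }
  if (len > 1 && len < sizeof cmd && p[len - 1] == '|') {  /* "cmd |" */
    memcpy(cmd, p, len - 1);
    cmd[len - 1] = '\0';
    return pipe_to_shell(cmd, 0);
  }
  return open(filename, mode, mask);   /* everything else: plain open(2) */
}
```

The descriptor returned for a piped name is an ordinary file descriptor, so it can be handed to read(), write() or fdopen() like any other.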

It has to be stressed that with this substitution in place, no matter how one opens a file -- with open(), fopen(), ofstream(), etc. -- one can submit extended file names and enjoy their functionality. You don't need Perl or Expect: piped file names may appear anywhere files are opened.

One can "extend" file names even further, by allowing an http:// prefix or a host:port|| suffix. In the latter case, "open" will do listen() first.
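
Recognizing the network forms is a matter of string parsing. The sketch below covers the "tcp://host:port" prefix form and the "host:port||" suffix form; the struct endpoint, the function parse_endpoint() and the exact validation rules are assumptions of this example, and the http:// prefix is left out. A caller would then connect() to the endpoint, or listen() first when listen_first is set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing a network-style extended file name. */
struct endpoint {
  char host[256];
  int  port;
  int  listen_first;   /* non-zero for the "host:port||" form */
};

/* Returns 0 and fills *ep if name is "tcp://host:port" or "host:port||";
   returns -1 if the name has neither form. */
int parse_endpoint(const char *name, struct endpoint *ep)
{
  const char *host = name;
  const char *colon = NULL;
  size_t len = strlen(name);
  size_t host_len, i;
  char *end;
  long port;

  ep->listen_first = 0;
  if (strncmp(name, "tcp://", 6) == 0) {
    host = name + 6;
    len -= 6;
  } else if (len > 2 && strcmp(name + len - 2, "||") == 0) {
    ep->listen_first = 1;
    len -= 2;                      /* drop the "||" suffix */
  } else {
    return -1;
  }

  /* The port follows the last colon within host[0..len). */
  for (i = len; i > 0; i--)
    if (host[i - 1] == ':') {
      colon = host + i - 1;
      break;
    }
  if (colon == NULL)
    return -1;
  host_len = (size_t)(colon - host);
  if (host_len == 0 || host_len >= sizeof ep->host)
    return -1;

  /* The port must be a number that ends exactly where the name does. */
  port = strtol(colon + 1, &end, 10);
  if (end == colon + 1 || end != host + len || port <= 0 || port > 65535)
    return -1;

  memcpy(ep->host, host, host_len);
  ep->host[host_len] = '\0';
  ep->port = (int)port;
  return 0;
}
```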

The extended file names and the open() substitution are part of a C++ "advanced" i/o and arithmetic compression class library. Its Makefile runs verification code that tests that the interposition really works.

Examples

  FILE * fp = fdopen(sys_open(" | gzip -best > file.gz ",
         O_WRONLY | O_CREAT | O_TRUNC,0777),"wb");

  (with-input-from-file
   (string-append "   cat " file-name "|  tee transcript  |") ...)

  cout << "\tReading from a datetime port" << endl;
  FILE * fp = fopen("tcp://localhost:13","r");
  if( fp == 0 )
    perror("Opening failed"), _error("Failure");
  cout << "\t\tthe result is: ";
  int c;
  while( (c = fgetc(fp)) != EOF )
    cout << (char)c;
  fclose(fp);

  (let ((io-port (##open-input-output-file
           "| while read i; do echo $i | sed 's/[Ff]oo/bar/g'; done ")))
    (cerr "\t\tsending pattern: " orig-string nl)
    (display orig-string io-port)
    (display #\newline io-port)
    (flush-output io-port)
    (cerr "\n\t\tdone sending; receiving the result\n")
    (with-input-from-port io-port
      (lambda () [elided])))

Version

The current version is February 2013.

Last updated December 5, 2008

This site's top page is http://okmij.org/ftp/

oleg-at-okmij.org
Your comments, problem reports, questions are very welcome!