Partially Applied Functions in C

2013-07-20

Some functions in the standard C library take a function pointer to be used as a callback later on. Examples include atexit() and signal(). However, these functions can't receive an arbitrary pointer (which could hold some important program state) in addition to the function pointer, so you're left with pesky global variables:

/* You have: */
atexit(foo); /* foo() will have to fetch program state from globals */

/* Instead of: */
static struct program_state state;
atexit(foo, &state); /* foo() now has a pointer to program state */
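For comparison, here is a minimal runnable sketch of the globals-based version that the standard atexit() forces on you. The struct field and the register_foo() helper are made up for illustration; only atexit() itself is real.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical program state; the single field is illustrative only. */
struct program_state {
    int cleanup_calls;
};

/* The callback gets no argument, so the state has to be a global. */
static struct program_state state;

static void
foo(void)
{
    state.cleanup_calls++;
    fprintf(stderr, "foo() ran, cleanup_calls=%d\n", state.cleanup_calls);
}

/* Register foo() to run at exit. atexit() returns 0 on success. */
static int
register_foo(void)
{
    return atexit(foo);
}
```

Every callback registered this way needs its own global (or shares one), which is exactly the inconvenience the rest of this post works around.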

Turns out that there's a workaround, but it involves some black magic.

I find the overall mechanism quite interesting, but I do not recommend using it. Not only does the implementation waste a whole memory page per callback, but I also don't want to encourage people to perpetuate this kind of take-pointer-to-function-without-argument nonsense.
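To put a number on the wasted page: mmap() hands out memory in whole pages, so even a stub of a couple dozen bytes occupies at least one page. A rough sketch of the arithmetic follows; the helper names are hypothetical, and sysconf(_SC_PAGESIZE) is the POSIX way to query the page size.

```c
#include <stddef.h>
#include <unistd.h>

/* Round len up to a multiple of page_size (page_size must be non-zero). */
static size_t
round_up_to_page(size_t len, size_t page_size)
{
    return (len + page_size - 1) / page_size * page_size;
}

/* Bytes actually occupied by an mmap() of len bytes on this system.
 * Returns 0 if the page size cannot be determined. */
static size_t
mapping_footprint(size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0)
        return 0;
    return round_up_to_page(len, (size_t)page_size);
}
```

With 4096-byte pages, round_up_to_page(22, 4096) is 4096: a 22-byte stub costs a full page.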

I'll try to explain how this contraption works by showing the smaller parts first. I'll begin with the template function. The idea is to have a function whose code can be patched up later -- with the twist that the code is generated by the compiler:

#define PARAMETER_CONSTANT 0xFEEDBEEF
#define FUNCTION_CONSTANT 0xABAD1DEA

static void
partial_template_function(void)
{
    ((void (*)(void *))FUNCTION_CONSTANT)((void *)PARAMETER_CONSTANT);
}

The funky-looking cast basically says "call the function at address FUNCTION_CONSTANT, passing PARAMETER_CONSTANT as its pointer argument". Of course, if you call this code as is, the program will most likely crash. The point is that the compiler turns it into the following code (IA32 assembly):

0f00deba <partial_template_function>:
   0:    55                       push   %ebp
   1:    89 e5                    mov    %esp,%ebp
   3:    83 ec 18                 sub    $0x18,%esp
   6:    c7 04 24 ef be ed fe     movl   $0xfeedbeef,(%esp)
   d:    b8 ea 1d ad ab           mov    $0xabad1dea,%eax
  12:    ff d0                    call   *%eax
  14:    c9                       leave
  15:    c3                       ret

Even if you don't know assembly, if you squint a little bit, you can clearly see the magic constants defined in the C code above (stored little-endian, so 0xFEEDBEEF shows up as ef be ed fe). These magic values can be replaced with something useful (such as a real function or some real pointer argument) by a trivial patching function:

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool
patch_pointer(void *code_addr, size_t code_len, void *look_for,
      void *patch_with)
{
    unsigned char *code = code_addr;
    size_t i;

    /* Stop early enough that a pointer-sized read never leaves the buffer. */
    for (i = 0; i + sizeof(void *) <= code_len; i++) {
        /* memcmp()/memcpy() avoid unaligned pointer-sized accesses and
         * handle any pointer width, not just 4 bytes. */
        if (memcmp(code + i, &look_for, sizeof(void *)) == 0) {
            memcpy(code + i, &patch_with, sizeof(void *));
            return true;
        }
    }

    return false;
}
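To see the search-and-patch step in isolation, here is a sketch that works on the 22 bytes from the disassembly listing above instead of on live code. It assumes 32-bit little-endian immediates, as in that IA32 listing; encode_le32(), find_le32() and patch_le32() are hypothetical helpers written for this illustration and are not part of the original source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Machine code bytes copied from the IA32 disassembly listing. */
static unsigned char template_code[] = {
    0x55,                                     /* push %ebp */
    0x89, 0xe5,                               /* mov %esp,%ebp */
    0x83, 0xec, 0x18,                         /* sub $0x18,%esp */
    0xc7, 0x04, 0x24, 0xef, 0xbe, 0xed, 0xfe, /* movl $0xfeedbeef,(%esp) */
    0xb8, 0xea, 0x1d, 0xad, 0xab,             /* mov $0xabad1dea,%eax */
    0xff, 0xd0,                               /* call *%eax */
    0xc9,                                     /* leave */
    0xc3                                      /* ret */
};

/* Encode a 32-bit value the way IA32 stores it: low byte first. */
static void
encode_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}

/* Return the offset of the first occurrence of value in code, or -1. */
static long
find_le32(const unsigned char *code, size_t code_len, uint32_t value)
{
    unsigned char needle[4];
    size_t i;

    encode_le32(value, needle);
    for (i = 0; i + sizeof(needle) <= code_len; i++) {
        if (memcmp(code + i, needle, sizeof(needle)) == 0)
            return (long)i;
    }
    return -1;
}

/* Overwrite the first occurrence of look_for with patch_with.
 * Returns 1 on success, 0 if look_for was not found. */
static int
patch_le32(unsigned char *code, size_t code_len,
      uint32_t look_for, uint32_t patch_with)
{
    long off = find_le32(code, code_len, look_for);

    if (off < 0)
        return 0;
    encode_le32(patch_with, code + off);
    return 1;
}
```

On these bytes, 0xFEEDBEEF is found at offset 9 and 0xABAD1DEA at offset 14, matching the immediates of the movl and mov instructions in the listing. Patching offset 9 with 0x12341337 leaves the bytes 37 13 34 12 there.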

patch_pointer() is then used to patch the pointers in a copy of the template that lives in a page allocated with mmap() (comments and error recovery have been omitted for brevity; full source code is linked below):

struct Partial *
partial_new(void (*func)(void *data), void *data)
{
    struct Partial *t;

    if (!func) return NULL;

    t = calloc(1, sizeof(*t));
    /* partial_template_function must be declared just before partial_new
     * so that caller_len is calculated correctly */
    t->caller_len = (size_t)((intptr_t)partial_new -
          (intptr_t)partial_template_function);

    t->caller = mmap(0, t->caller_len, PROT_WRITE | PROT_READ,
          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    memcpy(t->caller, partial_template_function, t->caller_len);

    patch_pointer(t->caller, t->caller_len, (void *)FUNCTION_CONSTANT, func);
    patch_pointer(t->caller, t->caller_len, (void *)PARAMETER_CONSTANT, data);

    mprotect(t->caller, t->caller_len, PROT_EXEC | PROT_READ);

    return t;
}

The end result will be a function that can be called without arguments -- which will magically call another function with a given parameter:

static void
test(void *data)
{
    printf("Test called with data=%p\n", data);
}

int main(void)
{
    struct Partial *p;

    p = partial_new(test, (void *)0x12341337);
    atexit(partial_to_function(p));

    return 0;
}

Which, when executed, will print:

[l@navi /tmp]$ ./a.out
Test called with data=0x12341337
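The example relies on struct Partial and partial_to_function(), neither of which is shown in this post. A minimal sketch of what they might look like, assuming the struct holds only the two fields that partial_new() touches (the real definitions in the full source may differ):

```c
#include <stddef.h>

/* Assumed layout: just the fields used by partial_new(). */
struct Partial {
    void *caller;      /* executable copy of the patched template */
    size_t caller_len; /* length of the mapping, in bytes */
};

/* Return the patched stub as a plain void(void) function pointer,
 * suitable for atexit(). Converting void * to a function pointer is
 * not guaranteed by ISO C, but POSIX requires it to work (dlsym()
 * depends on it). */
static void (*partial_to_function(struct Partial *p))(void)
{
    if (!p)
        return NULL;
    return (void (*)(void))p->caller;
}
```

Keeping the cast inside one accessor means callers never have to spell out the function pointer type themselves.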

So there you have it, partially applied functions in C. Useful? Hardly. Interesting? I think so. Fun? Yup.

If you'd like to try it, the full source code, with comments and error recovery, is available in this gist.
