Introducing ujail & proof of concept

Lately I have been thinking about methods to provide a stripped down, secured environment for running untrusted code on GNU/Linux. With this post I would like to present you with the first results of my research.

ujail - brief introduction

I have chosen ujail as the name for the technique I am proposing. ujail stands for micro jail in userspace, which already describes the concept in a nutshell. The main idea is to have a userspace process monitor the system calls of one of its children and emulate some of them if needed. This is done with ptrace, using both PTRACE_SYSEMU and PTRACE_SYSCALL.
The ujail process should not only be able to monitor syscalls, like strace does, but also intercept and emulate them.

This sounds a lot like User Mode Linux (UML), but the method is different. Whilst UML comes with a complete kernel, emulates all system calls and thereby provides a virtualized system, ujail is intended to emulate only some system calls, without emulating the kernel.


To better explain how the ujail technique works, I would like to take a quick look at PTRACE_SYSCALL and PTRACE_SYSEMU first.

PTRACE_SYSCALL allows a userspace process to be notified whenever a traced process enters or leaves a system call. This means that two notifications are normally sent: one on system call entry and one on system call exit. Even though the tracer can change the parameters of a system call, the kernel still executes the call, so this method does not allow system calls to be fully emulated (think of a virtual filesystem here).
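The entry/exit pairing can be observed with a small sketch like the following. It is only an illustration, not part of the proof of concept: the function name count_syscall_stops is made up for this example, and it assumes Linux with glibc. The child makes a known number of getpid calls and the tracer counts the syscall stops it receives.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that makes `calls` getpid syscalls and
 * return how many syscall stops the tracer saw, or -1 on error. */
int count_syscall_stops(int calls)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        kill(getpid(), SIGSTOP);          /* let the tracer catch up */
        for (int i = 0; i < calls; i++)
            syscall(SYS_getpid);          /* raw syscall, no libc caching */
        _exit(0);
    }

    int status;
    int stops = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;                        /* child exited early and was reaped */

    /* Mark syscall stops with SIGTRAP|0x80 so they cannot be confused
     * with an ordinary SIGTRAP. */
    ptrace(PTRACE_SETOPTIONS, child, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
            return -1;
        }
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return stops;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;                      /* one stop per entry, one per exit */
    }
}
```

Calling count_syscall_stops(3) should report at least six stops, two for each getpid call, plus whatever the child needs in order to exit.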

PTRACE_SYSEMU, on the other hand, provides a single notification on syscall entry and expects the receiver of the notification to emulate the syscall; the kernel does not execute it. This method alone sounds great, but it also means that every system call has to be emulated, including those for memory allocation, which is quite complex in userspace.
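To make the emulation idea more concrete, here is a minimal sketch of a tracer that answers getpid itself. It is not the proof of concept code: the name emulate_getpid and the REG_* macros are invented for this example, and it assumes Linux on x86 (i386 or x86_64) with glibc. Since the kernel does not execute anything under PTRACE_SYSEMU, not even exit_group, the child reports the value it saw through the first argument of exit_group and the tracer then kills it.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Register names differ between i386 and x86_64 (illustrative macros). */
#if defined(__x86_64__)
#define REG_NR(r)   ((r).orig_rax)   /* syscall number */
#define REG_RET(r)  ((r).rax)        /* return value   */
#define REG_ARG0(r) ((r).rdi)        /* first argument */
#elif defined(__i386__)
#define REG_NR(r)   ((r).orig_eax)
#define REG_RET(r)  ((r).eax)
#define REG_ARG0(r) ((r).ebx)
#else
#error "this sketch only covers x86"
#endif

/* Hypothetical helper: run a child under PTRACE_SYSEMU, answer its getpid
 * call with fake_pid and return the value the child passes to exit_group.
 * Returns -1 on error. */
long emulate_getpid(long fake_pid)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        kill(getpid(), SIGSTOP);
        long seen = syscall(SYS_getpid);   /* answered by the tracer */
        syscall(SYS_exit_group, seen);     /* intercepted, never executed */
        for (;;) { }                       /* spin until the tracer kills us */
    }

    int status;
    long reported = -1;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;                         /* child exited early and was reaped */

    for (;;) {
        struct user_regs_struct regs;

        if (ptrace(PTRACE_SYSEMU, child, NULL, NULL) < 0)
            break;
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (!WIFSTOPPED(status))
            return -1;                     /* child is gone already */
        if (WSTOPSIG(status) != SIGTRAP)
            continue;                      /* not a syscall stop */
        if (ptrace(PTRACE_GETREGS, child, NULL, &regs) < 0)
            break;

        if (REG_NR(regs) == SYS_getpid) {
            REG_RET(regs) = fake_pid;      /* the emulated result */
            if (ptrace(PTRACE_SETREGS, child, NULL, &regs) < 0)
                break;
        } else if (REG_NR(regs) == SYS_exit_group) {
            reported = (long)REG_ARG0(regs);
            break;
        }
        /* Any other syscall is simply skipped in this sketch. */
    }

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return reported;
}
```

The fact that even exit_group has to be handled by the tracer illustrates why pure PTRACE_SYSEMU quickly becomes a lot of work.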


Now on to the concept behind ujail. The method I am describing works by resuming a specific process with PTRACE_SYSEMU and this way taking over emulation of all of its system calls. However, some system calls are complex to emulate in userspace, so a hybrid of PTRACE_SYSEMU and PTRACE_SYSCALL is needed. In short, when the PTRACE_SYSEMU event is received, the monitor checks whether the syscall needs to be emulated, and then handles it in one of two ways.
The first way is to emulate the syscall, fill in the process's registers and resume execution of the process. This is simple and straightforward.

The second way is forwarding the system call to the kernel. The problem here is that calling the syscall in the monitoring process will make the new resources available to that very process, and not the process to be jailed. This is where the hybrid method kicks in.

The proof of concept code creates a backup of the next instruction to be executed, along with a copy of the instruction pointer at this point, and patches the instruction with the opcodes for "int $0x80", causing the syscall to be made again. After that it resumes execution with PTRACE_SYSCALL and waits again. The first event to be received now is the program leaving the emulated system call, which can be ignored. Resuming yet again will give us two PTRACE_SYSCALL events, one for syscall entry and one for syscall exit.

The first event is not really interesting, but at the second event the opcode backup is restored and eip is reset to the saved value. At this point the kernel has handled the syscall and the result is ready for the child process. A final call of PTRACE_SYSEMU resumes execution of the child and waits for the next syscall.
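The backup, patch and restore steps described above could be sketched roughly as follows. This is not the actual proof of concept: patch_and_restore, peek_word and REG_IP are names made up for this example, and it assumes Linux on x86 with glibc. The sketch only writes the "int $0x80" opcodes (0xCD 0x80) over the word at the child's instruction pointer, checks them and puts the original bytes back; it never lets the child execute the patched instruction. On x86_64 the native instruction would be "syscall" (0x0F 0x05) rather than "int $0x80".

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__x86_64__)
#define REG_IP(r) ((r).rip)
#elif defined(__i386__)
#define REG_IP(r) ((r).eip)
#else
#error "this sketch only covers x86"
#endif

/* Read one word from the child. PTRACE_PEEKTEXT returns the data itself,
 * so errno is the only way to tell an error from a word that is -1. */
int peek_word(pid_t pid, unsigned long addr, long *out)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

/* Hypothetical helper: back up the word at the stopped child's instruction
 * pointer, patch in "int $0x80", verify, then restore and verify again.
 * Returns 0 if every step worked, -1 otherwise. */
int patch_and_restore(void)
{
    static const unsigned char int80[2] = { 0xCD, 0x80 };

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        kill(getpid(), SIGSTOP);
        for (;;) { }                      /* never resumed in this sketch */
    }

    int status;
    int ok = -1;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;                        /* child exited early and was reaped */

    struct user_regs_struct regs;
    long backup, patched, check;
    if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0) {
        unsigned long ip = (unsigned long)REG_IP(regs);   /* saved ip */
        if (peek_word(child, ip, &backup) == 0) {
            patched = backup;
            /* x86 is little-endian: the first two bytes in memory are
             * the first two bytes of the word. */
            memcpy(&patched, int80, sizeof int80);
            if (ptrace(PTRACE_POKETEXT, child, (void *)ip, (void *)patched) == 0 &&
                peek_word(child, ip, &check) == 0 && check == patched &&
                ptrace(PTRACE_POKETEXT, child, (void *)ip, (void *)backup) == 0 &&
                peek_word(child, ip, &check) == 0 && check == backup)
                ok = 0;
        }
    }

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return ok;
}
```

In the real technique the child would be resumed with PTRACE_SYSCALL between the patch and the restore, and its instruction pointer would be set back to the saved value afterwards.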

Proof of concept

The proof of concept code can be downloaded from its bazaar branch at launchpad.net. It is intended to be used on i386 systems only and works with simple programs, but it is known not to work with anything using fork or vfork, and it most likely will not work for binaries using threads.

Finally, I would like to thank Pradeep Padala for his "Playing with ptrace" articles [0][1], which were fun to read and served as a great introduction to ptrace for me.

Now there is only one thing left to say: if you are interested in this method, have spotted loopholes or problems, or want to contribute, please go ahead and contact me:

debian at sp dot or dot at


  1. If this also works on BSD systems, it would be nice if you could license it under a BSD license, like the one used by FreeBSD. And ptrace() appears to be available on BSD systems.

  2. I haven't checked whether FreeBSD supports something like PTRACE_SYSEMU, which is required for this method to work. Anyways, it's unlikely that I'll relicense the proof of concept and the actual code is yet to be written.

  3. Have you looked at utrace yet? It might make this a lot easier.

    Also, why patch the process rather than just modifying its state and trapping into the kernel?

  4. @anonymous commenter:

    Thanks for your input, I have just uploaded a second proof of concept that only modifies the state (=EIP). This seems to work too and should obviously be a lot faster.

  5. Cool! This approach could be very useful, particularly on systems lacking KVM support for which UML appears to be the most efficient option. (The only other lightweight tool along those general lines I've encountered is http://fakechroot.alioth.debian.org/ , which relies on an LD_PRELOAD hack and as such cannot provide a proper sandbox; in particular, statically linked binaries necessarily slip through the cracks.)

  6. Hi,
    do you know ViewOS and KMView?

  7. @Aaron:

    LD_PRELOAD can be worked around using code that directly invokes syscalls, so this is a bad solution.


    I didn't know those two yet, thanks for letting me know. KMView requires a kernel modification, so this is a lot different from ujail (which runs in userspace entirely and does not require you to modify your kernel in any way). UMView looks a lot more like ujail, but seems to do a bit more. I would like to keep ujail as simple and lightweight as possible.