tech & sp: 2009

2009-12-20

Rest in peace Flo: 13.11.1986–16.12.2009

Today is a sad day. Everything feels like I am having a bad nightmare. That's because today I learnt of the untimely death of my friend Florian Hufsky.

I am sitting here and do not really know what to write. I keep on thinking about the great times we spent together. The time we started programming when we were twelve. The time we spent learning BASIC. All the times you knew more than me and could teach me a thing or two. I remember our geek talks. How we would discuss the latest games. How we lost contact and how we met again. I am thinking about how sorry I am for not having met you often enough. I keep on trying to understand what drove you that far. How you could just end it all. More and more memories come to my mind, like the moment when you showed me one of your projects, Super Mario War. The moments we had playing video games together. All those moments, all that time, I miss you my friend. You were a genius, always a step ahead, not only of me, but seemingly the whole world. I can't stop thinking about your brilliant ideas and how you always finished your projects. You were a real hacker, a real genius, a person trying to make the world a better place, a person who will be missed, not only by me.

You were a genius and I always respected you, not only as a hacker, but as a beloved friend. Why did we not spend more time together? Why did you have to go? Why do I have to write this now, sitting here in my chair with tears in my eyes? And all those memories come up again and again. There is so much more that comes to my mind, but I can't keep on writing, it just hurts too much.

The world is a sad place today. I am sad. I am mourning the untimely death of my beloved friend, Florian. You will always have a special place in my heart.

2009-12-09

ujail: use cases, FAQs, part 1 & proof of concept, part 2

As I ran out of time whilst writing the "introducing ujail" post on Monday, I would like to elaborate further on the idea, giving you some examples of possible use cases and then having a look at FAQs regarding ujail. Additionally, I have created a second proof of concept that should be a lot faster; see below for more details.

Use cases of ujail

Monday's post was rather technical, so let's have a look at possible use cases today.

The main reason for both having the idea of ujail and starting work on it is my web server. I am running quite a few (S)CGI scripts there and, even though I run them as different users on a per-vhost basis, I have the impression that the whole setup is a bit insecure.

Okay, PHP does provide its famous open_basedir feature, but I am also running some Python applications which I simply cannot restrict easily. My first ideas involved adding something similar to open_basedir to Python, followed by the idea of replacing some C library functions, like fopen and friends, at startup.

Whilst adding open_basedir to Python would have involved changing a lot of Python's internals, I soon discarded the library patching idea too, as the patched functions could be worked around by injected code invoking syscalls directly. It didn't take long for me to notice that I had to dig deeper. The idea of ujail was born, and after coming up with the proof of concept this seems to be a viable solution.

Now ujail is not only about protecting a web server from its web applications, but could do a lot more, for example:

Creating a sandbox for untrusted code (socket and file I/O emulation)
Implementing some sort of personal firewall (socket-call only emulation)
Testing applications that perform low-level system operations (read: package managers and friends, filesystem emulation)

I am sure you can come up with even more use cases. What should be noted is that emulating a system call does not mean that one necessarily needs to emulate the whole filesystem. What can be done, for example, is passing through access to common files (libraries, executables, etc.) whilst maintaining a virtual filesystem for data that will eventually be modified. A copy-on-write approach is possible too. There are multiple ways in which the virtual filesystem could be implemented; the most common would probably be using a state directory.

FAQs

There have been some questions about ujail in comments to my first post which I would like to answer. Also, I have been thinking about things that are different about ujail compared to other virtualization techniques. Feel free to add additional questions either in a comment or drop me an email: debian at sp dot or dot at.

Could you change the license of ujail to ... ?

Not likely to happen. The proof of concept's license is GPLv3 and the actual code's license will be too. However, ujail is a userspace application that does not need any modifications to the kernel so there should be no problems with porting ujail from GNU/Linux to any other system.

Does ujail work on operating systems other than GNU/Linux?

Not yet. If it's technically possible to implement the technique on other operating systems I would be happy to accept patches.

Do I need to patch my kernel for ujail to work?

No, ujail is running in userspace. The only thing it needs is Linux with support for PTRACE_SYSEMU.

How is this approach different from using LD_PRELOAD?

With LD_PRELOAD one can replace library functions, but malicious code could still directly invoke syscalls, working around this protection completely. Also, statically linked binaries cannot be restricted with LD_PRELOAD.

How is this approach different from user-mode-linux?

User-mode-linux (UML) works by emulating a full kernel in userspace and allows you to virtualize a whole Linux instance (including a new init process, etc). ujail is about providing a way of restricting a single process (and its children) inside a running system in terms of access to syscalls and the partial emulation of those.

How is this approach different from linux-vserver?

Linux-vserver is a kernel patch and runs in kernel space, as opposed to ujail, which works in userspace.
Also, linux-vserver works similarly to user-mode-linux, providing a fully virtualized Linux instance.

Does the account running ujail need any special privileges?

No, the only restrictions that apply are those of ptrace.

Where is the code?

Right now ujail is in a planning phase, and only the proof of concept code has been written and published. The actual ujail code is yet to be written and the code will be hosted on launchpad.net.

Proof of concept, part 2

An anonymous person (who were you, stranger?) added a comment to my first post, asking "Also, why patch the process rather than just modifying its state and trapping into the kernel?". I had looked at this approach earlier, but it didn't work out. However, I decided to give it yet another try and created a second proof of concept. That code does not require patching any code, but only modifies the instruction pointer (eip) and the first register (eax). This should be a lot faster than patching the code.

Technically the new main loop works by calling PTRACE_SYSEMU and waiting for a notification. It then saves the instruction pointer and switches to PTRACE_SYSCALL. As before it waits for the emulated syscall to exit and at this point sets eax from orig_eax and decreases the value of the instruction pointer by the size of the "int $0x80" instruction. Another call to PTRACE_SYSCALL resumes the process. The next event is the process actually entering the real syscall and yet another one leaving the syscall again. These are resumed by PTRACE_SYSCALL and PTRACE_SYSEMU respectively. So, comparing this with the first approach we are only modifying two registers now, instead of writing to the TEXT area of the running process.
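The register fix-up at the heart of this loop can be modelled without ptrace at all. The following Python sketch is only a toy model of the step described above, not the proof of concept itself: the names Regs and rewind_to_syscall are made up for illustration, and the example values assume an i386 process in which eax holds -ENOSYS (-38) at the syscall stop. A real tracer would read and write these registers with PTRACE_GETREGS and PTRACE_SETREGS from C.

```python
from dataclasses import dataclass

# "int $0x80" is encoded as the two bytes 0xCD 0x80 on i386.
INT80_SIZE = 2


@dataclass
class Regs:
    """Toy stand-in for the few i386 registers the tracer touches."""
    eip: int
    eax: int
    orig_eax: int


def rewind_to_syscall(regs):
    """Return registers that make the tracee execute the same syscall again.

    eax is restored from orig_eax (the syscall number) and eip is moved
    back by the size of the "int $0x80" instruction.
    """
    return Regs(eip=regs.eip - INT80_SIZE,
                eax=regs.orig_eax,
                orig_eax=regs.orig_eax)


# Hypothetical state right after the emulated syscall has "exited".
stopped = Regs(eip=0x08048002, eax=-38, orig_eax=4)
resumed = rewind_to_syscall(stopped)
```

With these two registers rewritten, resuming the process makes it run into the very same "int $0x80" again, this time for real.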

Thanks should go to the anonymous commenter for making me give this approach another try.

Questions? Criticism? More ideas? Want to contribute?

Coming to an end, I would yet again like to let you know that I am open to questions, criticism, more ideas and contributions in general. So if you are interested in this topic come join the discussion by either dropping me an email, writing a comment to this post or replying to this post on your own blog.

2009-12-07

Introducing ujail & proof of concept

Lately I have been thinking about methods to provide a stripped down, secured environment for running untrusted code on GNU/Linux. With this post I would like to present you with the first results of my research.

ujail - brief introduction

I have chosen ujail as the name for the technique I am proposing. ujail stands for micro jail in userspace and, in itself, describes the concept briefly. The main idea is to have a userspace process monitor the system calls of one of its children and emulate some calls, if needed. This is done using ptrace, namely both PTRACE_SYSEMU and PTRACE_SYSCALL.
The ujail process should not only be able to monitor syscalls, like strace does, but also intercept and emulate them.

This sounds a lot like user mode linux (uml), but the method is different. Whilst uml comes with a complete kernel, emulates all system calls and this way provides a virtualized system, ujail is intended to only emulate some system calls, without emulating the kernel.

Revisiting PTRACE_SYSCALL & PTRACE_SYSEMU

To better explain how the ujail technique works I would like to have a quick look at PTRACE_SYSCALL and PTRACE_SYSEMU again.

PTRACE_SYSCALL allows a userspace process to be notified whenever a traced process enters or leaves a system call. This means that two notifications are normally sent: one before system call entry and one afterwards. Even though one is able to change the parameters of system calls this method does not allow system calls to be fully emulated (think virtual filesystem here).

PTRACE_SYSEMU on the other hand provides one notification on syscall entry and expects the receiver of the notification to emulate the syscall. This method alone sounds great, but this also means that memory allocation needs to be emulated too, which is quite complex in userspace.

A hybrid of PTRACE_SYSCALL & PTRACE_SYSEMU

Now on to the concept behind ujail. The method I am describing works by calling PTRACE_SYSEMU for a specific process and this way taking over emulation of all system calls. However, some system calls are complex to emulate in userspace, and so a hybrid of both PTRACE_SYSEMU and PTRACE_SYSCALL is needed. In short this works by checking whether the syscall needs to be emulated when the PTRACE_SYSEMU event is received.
Now one way is emulating the syscall, filling the process's registers and resuming execution of the process. This is simple and straightforward.

The second way is forwarding the system call to the kernel. The problem here is that calling the syscall in the monitoring process will make the new resources available to that very process, and not the process to be jailed. This is where the hybrid method kicks in.

The proof of concept code creates a backup of the next instruction to be executed along with a copy of the instruction pointer at this point and patches it with the opcodes for "int $0x80", causing the syscall to be made again. After that it resumes execution with PTRACE_SYSCALL and waits again. The first event to be received now is the program leaving the emulated system call, which can be ignored. Resuming yet again will give us two PTRACE_SYSCALL events, one for syscall entry and one for syscall exit.

The first event is not really interesting, but at the second event the opcode backup is restored and the eip set from the saved value. Now the kernel has handled the syscall and the result is ready for the child process. A final call of PTRACE_SYSEMU resumes execution of the child and waits for the next syscall.
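The backup, patch and restore steps can be illustrated on a plain byte buffer. This is a minimal Python sketch under stated assumptions: the tracee's TEXT area is modelled as a bytearray and eip as an index into it, and the function names are hypothetical. The actual proof of concept works on a live process through ptrace, which reads and writes whole machine words rather than two bytes.

```python
# Opcodes for "int $0x80" on i386.
INT80 = bytes([0xCD, 0x80])


def patch_int80(text, eip):
    """Overwrite the bytes at eip with "int $0x80" and return the backup."""
    backup = bytes(text[eip:eip + len(INT80)])
    text[eip:eip + len(INT80)] = INT80
    return backup


def restore(text, eip, backup):
    """Put the original bytes back once the kernel has handled the syscall."""
    text[eip:eip + len(backup)] = backup


# Four NOPs stand in for the tracee's code; eip = 1 is an arbitrary offset.
text = bytearray(b"\x90\x90\x90\x90")
saved_eip = 1
backup = patch_int80(text, saved_eip)
patched = bytes(text)
restore(text, saved_eip, backup)
```

After restore() the buffer is byte-for-byte identical to what it was before the patch, which is what allows the child to continue as if nothing had happened.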

Proof of concept

The proof of concept code can be downloaded from its bazaar branch at launchpad.net. It is intended to be used on i386 systems only and works with simple programs, but is known not to work with anything using fork, vfork and most likely will not work for binaries using threading.

Finally, I would like to thank Pradeep Padala for his "Playing with ptrace" articles [0][1], which were fun to read and worked as a great introduction to ptrace for me.

Now there is only one thing left to say: if you are interested in this method, see loopholes or problems or want to contribute, please go ahead and contact me:

debian at sp dot or dot at

2009-12-01

How to copy partitions under GNU/Linux the easy way

After getting a new disk for my Popcorn Hour A-110 device I had to copy all partitions from the old disk onto the new one so I do not have to reinstall some applications and reconfigure everything.

After searching the web for a free alternative to Norton Ghost and Acronis True Image, preferably one that does not need its own boot disk (I did not want to back up my workstation after all, just do a simple partition-to-partition copy between two SATA disks), I gave up and decided to do the copying manually.

So I fired up gparted to do the partitioning, did a right click and... I noticed that gparted supports copy/paste. Being curious about what this could potentially do I gave it a try. I marked partition one on the old disk, did a copy, went to the new disk and clicked on paste - and guess what, gparted did what I was looking for.

To cut a long story short: you can copy whole partitions using gparted's copy/paste mechanism and even resize them whilst doing so. I am somewhat ashamed I did not notice this feature earlier, having been a gparted user for a few years now, and I can imagine I am not the only one who missed it.

2009-11-11

kvm, qemu and the magic of ubuntu-vm-builder

As I noted two days ago I was unable to build Android on Ubuntu 9.10 x86-64 and thus needed to set up a virtual machine.

At first I went for my preferred virtualization solution, VirtualBox, and noticed that even though I assigned all 4 processor cores of my workstation (along with 2GiB of memory) to the virtual machine, building was painfully slow. I immediately ditched the idea of using VirtualBox again and decided to give something new to me a try: the combination of kvm and qemu.

Having Intel VT-x support built into my workstation's processor I thought that this combination should give better performance, and I wasn't disappointed. To be honest, I am astonished at how fast the beast is now. Disk speed still seems to be not as fast as running things natively, but there must be a downside somewhere. :-)

After a bit of googling I also found that ubuntu-vm-builder exists, which simplifies virtual system creation tremendously.

My Android working tree is being synchronized right now, which means that I should be able to start building in a few minutes' time. I hope the virtual machine stays as fast as it is right now during the build and I hope everything goes well.

2009-11-09

An update on the proprietary Maemo SDK installer

Yesterday I wrote about my dissatisfaction with the current state of the Android 2.0 code tree and how a proprietary install script for Maemo scared me off.

As suggested in one of the comments to my post I filed a bug report against Maemo, bug 6087.

Besides getting quite a few replies to the bug report within a matter of hours Carsten Munk pointed me at Maemo SDK+, which has less restrictive licensing.

Another comment, by Marius Gedminas (thanks!) pointed me at Mer,

a new operating system for small, mobile touch-screen devices.
It is Linux based and layers the best open-source elements of Nokia's Maemo platform over a modern Ubuntu base.
The goals of Mer include:

Integrate the best solutions for a wide variety of small form-factor devices
Encourage wider access to device capabilities through the Vendor Social Contract
Demonstrably provide an easy route to market for vendors
Dramatically reduce costs to vendors of supporting EOL hardware
Focus, harness and support community contributions to the platform
Encourage and ease migration of existing applications
Support experimentation, innovation and development

My Android repositories

As I wrote in my last post I noticed a few problems with Android's roaming detection code and decided to try fixing it myself.

So, I am basing my work on CyanogenMod, which I am also using on my Android device. My repositories are hosted at github.com/speijnik and you can fetch (nearly) everything you need for building by using repo. See the README file in my android repository over at github for details.

For now only the simplification of the roaming detection code has made it into the repository, but be aware that even though I have published the code I still have neither built nor tried it, as I do not have a working build environment set up yet.

Oh, about the working build environment: there seem to be problems with either the webkit code in the Android repositories (unlikely) or with building that code on Ubuntu 9.10 x86-64 (more likely). Right now I am downloading Ubuntu 8.04 LTS i386 for use in a virtual machine. I will let you know whether that fixes my problems or not.

2009-11-08

Android's roaming detection & its implementation

I know I wrote about Android already today, but there is another thing that concerns me right now. I am the owner of an Android-based phone (an HTC Dream) and recently switched my mobile network provider. The problem is that my new provider is a virtual provider and as such there is no real network of that provider. Now Android has a feature to turn off broadband connections when in roaming mode, which itself is a great idea and can save you from paying quite a lot of money when the phone connects to 3G abroad, but this feature also turns off broadband connections when roaming locally. All this is being discussed in bug report #3499.

After noticing this problem I became curious about how Android detects that it is roaming and I found the GsmServiceStateTracker.isRoamingBetweenOperators method to be responsible for that magic, but soon noticed that the method is not only inefficient, but also doesn't work as intended. This is hardly related to the bug mentioned above, but let's have a look at the code in question:

    /**
     * Set roaming state when gsmRoaming is true and, if operator mcc is the
     * same as sim mcc, ons is different from spn
     * @param gsmRoaming TS 27.007 7.2 CREG registered roaming
     * @param s ServiceState hold current ons
     * @return true for roaming state set
     */
    private boolean isRoamingBetweenOperators(boolean gsmRoaming, ServiceState s) {
        String spn = SystemProperties.get(PROPERTY_ICC_OPERATOR_ALPHA, "empty");

        String onsl = s.getOperatorAlphaLong();
        String onss = s.getOperatorAlphaShort();

        boolean equalsOnsl = onsl != null && spn.equals(onsl);
        boolean equalsOnss = onss != null && spn.equals(onss);

        String simNumeric = SystemProperties.get(PROPERTY_ICC_OPERATOR_NUMERIC, "");
        String operatorNumeric = s.getOperatorNumeric();

        boolean equalsMcc = true;
        try {
            equalsMcc = simNumeric.substring(0, 3).
                    equals(operatorNumeric.substring(0, 3));
        } catch (Exception e){
        }

        return gsmRoaming && !(equalsMcc && (equalsOnsl || equalsOnss));
    }

Okay, let me summarize what this piece of code does wrong, at least from my understanding:

It takes both the short and the long alphanumeric identifier of the network operator and compares each of them to the alphanumeric identifier coming from the SIM card, whilst...

... it could simply use the network and SIM card numeric identifiers and compare those, which should be a lot cheaper than comparing those strings

Then it takes the first three characters/digits of the numeric identifiers (which indicate the country) and compares those

Now in my case my SIM card doesn't seem to provide the phone with an alphanumeric identifier, so the first two comparisons always fail for obvious reasons and, looking at the boolean expression in the last line of that method, my phone will always indicate that I am in roaming mode, even when I am not.
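To make the effect of that last line easier to follow, here is the same boolean logic transcribed into Python. This is only a sketch for experimenting with the truth table, not Android code: the function takes the values directly instead of reading SystemProperties and a ServiceState, and the operator names and numeric identifiers used below are made up.

```python
def _mcc(numeric):
    """Return the first three digits (the country code) or raise ValueError."""
    if numeric is None or len(numeric) < 3:
        raise ValueError("no MCC available")
    return numeric[:3]


def is_roaming_between_operators(gsm_roaming, spn, onsl, onss,
                                 sim_numeric, operator_numeric):
    """Python transcription of the Java method's decision logic."""
    equals_onsl = onsl is not None and spn == onsl
    equals_onss = onss is not None and spn == onss

    equals_mcc = True
    try:
        equals_mcc = _mcc(sim_numeric) == _mcc(operator_numeric)
    except ValueError:
        pass  # like the empty catch block: keep the default of True

    return gsm_roaming and not (equals_mcc and (equals_onsl or equals_onss))


# SIM without an alphanumeric identifier: spn falls back to "empty".
no_spn = is_roaming_between_operators(
    True, "empty", "HostNet", "Host", "23299", "23201")
# SIM that does report the same name as the network.
with_spn = is_roaming_between_operators(
    True, "HostNet", "HostNet", "Host", "23299", "23201")
```

With the "empty" fallback neither name comparison can succeed, so the result is simply whatever gsmRoaming says; the country code check never gets a chance to clear the roaming flag.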

The problem is not only the logic, which seems to be wrong; I also consider the inefficient comparisons used there to be a major problem on embedded systems like mobile phones. This is the first piece of Android code I have had a look at, but if all other code is as ugly and inefficient as these few lines Android really needs some major fixes. Related to this I have reported bug #4590 and forked the git repository in question over at github to fix this method, which should be a matter of 5 minutes.

Android, Mythbusters and openness

I have been reading a great many posts about Android lately, some consisting of criticism, some of praise and some simply addressing issues in the Android "community". Let's have a look at those.

Matt Porter's Android Mythbusters presentation and Harald Welte's reaction

I haven't seen the presentation live, but I had a look at the slides. Impressive work done by Matt putting all this information together. However, we have all known for a long time that Android only (ab-)uses Linux, without making use of the GNU userland, haven't we?

In his presentation Matt has shown things such as Android's udev "replacement" that uses hardcoded values for device node creation and (on his blog) Harald has then come up with a statement I have found to be very strong:

The presentation shows how Google has simply thrown 5-10 years of Linux userspace evolution into the trashcan and re-implemented it partially for no reason. Things like hard-coded device lists/permissions in object code rather than config files, the lack of support for hot-plugging devices (udev), the lack of kernel headers. A libc that throws away System V IPC that every unix/Linux software developer takes for granted. The lack of complete POSIX threads. I could continue this list, but hey, you should read those slides. now!

Now both of these statements target technical details, but the root of the problem seems to be elsewhere.

Where is my Android 2.0?

Okay, that heading might not be making any sense in the context of this post at a first glance, but let me elaborate on that. Google and the Open Handset Alliance refer to Android as being an "Open Source" operating system, but the project is different from "real" Free Software projects: development takes place in a closed group and the results are shared with the community later on, when they are deemed to be ready.

This means that innovation also takes place behind closed curtains and that the community is not involved in the actual development process at all. Lately we have seen the result of that, as Motorola is bragging about working closely with Google on Android 2.0 ("Eclair"), but the AOSP source trees, open for everyone to have a look at, show no signs of version 2.0. In fact no changes that might even remotely suggest the release of a new major version have been made public in the past few weeks. So where is the openness there?
Actually, the Motorola Droid has already shipped with Eclair on the 6th, but still, there is no indication that Eclair will be made available to the broader public.

In short, Android seems to be developed behind closed curtains, with hardly any (read: no) community input whatsoever, and is sometimes released as Free Software. That is not what I would describe as an open development process.

The Android Market problem

As we have seen in the past Google is enforcing their copyright on proprietary applications that ship with pretty much every Android device, such as the Android Market. This became really clear when Steve Kondik received a cease and desist letter for packing the Google-proprietary applications into his ROMs. Okay, it's Google's right to enforce their copyright and there is nothing wrong with actually doing so; the thing I really have a problem with is something else: the Market is proprietary.

Now what this means should become rather clear. You can have an Android device without Google's proprietary bits, but with default settings you just do not have any way of installing additional software. In my opinion the Market should be freed by Google themselves, or the community has to react and come up with a free replacement to overcome the vendor lock-in. Oh, you might know a replacement called SlideMe (or Mobentoo) already. Well, that bugger is proprietary too, so not a solution at all.

Nokia and Maemo to the rescue

In most discussions about the openness of Android someone throws in Nokia and Maemo, as a solution to the dilemma. Reading all those positive comments I simply had to give it a try, but all my hopes were destroyed within a few minutes.

Let's start with the good news and leave aside the reason why my hopes were destroyed for another minute or two. Maemo is based on Debian GNU/Linux and various Free Software components, such as GTK+, gstreamer, esd and friends. Most of the system is Free Software which is a good thing(tm) and reading all of this really got me into Maemo. Okay, some applications seem to be proprietary, but I am sure that could be fixed rather easily, so I could once and for all use a truly open phone.

...and then came the SDK installer shell script:

#!/bin/sh
# Copyright (C) 2006-2009 Nokia Corporation
#
# This is proprietary software owned by Nokia Corporation.
#
# Contact: Maemo Integration <integration@maemo.org>
# Version: $Revision: 1110 $

Now there is one question you should ask yourself: Why would someone trying to promote his platform as being open make the *installer* script for its SDK proprietary? Come on, it's an installer script, how much of your secret juice could be in there? What's the problem with people modifying it and working on this installer script in an open development environment?

I had high hopes for Nokia actually doing a bit better than Google, but it seems they've failed to do so. It may be me overreacting, but a proprietary SDK installer shell script scares me enough that, for now, I will neither install the SDK and have a look at it nor think about buying a Maemo-based device in the near future. Please Nokia, either get the facts straight or provide us with a free SDK to your free & open platform.

So, in short, Google is bad at working with the community and creating a truly open development process, and Nokia simply fails in terms of not scaring off prospective developers for their open platform with the proprietary SDK installer. Do you have any solutions in terms of an open phone environment, apart from what OpenMoko has come up with?

2009-11-04

How to move panels in Gnome 2.28

I just installed Ubuntu Karmic Koala on my workstation and came across the problem of not being able to move/drag Gnome panels around in order to have the panels on my primary monitor.
On the Debian system that was powering the workstation before, this was a non-issue as I could simply click, hold and drag both the upper and the lower panel, but this no longer worked.

So, after a few minutes of googling I came across an entry at answers.launchpad.net[0] and a blog post, but I cannot seem to remember the URL to that one. I can imagine that some of you might be having the exact same problem, so here is the solution: hold down the ALT key whilst dragging as usual.

[0] https://answers.launchpad.net/ubuntu/+source/gnome-panel/+question/264

2009-08-17

Automagic bug reporting in Python applications for Debian

We all know this situation: a program crashes and you need to send a bug report to the DBTS. The damn bug however is hard to reproduce and you fail to do so and hence can't submit the report.

This has all changed for update-manager now. With the next upload to unstable update-manager will get automagic bug reporting. In short: there is code that detects uncaught exceptions, asks the user if he or she wants to file a bug report and then invokes reportbug. Nothing too special about this yet. There is one thing that should make the lives of both bug reporters and developers easier though: the code automatically includes traceback information, which makes finding the cause of the problem a lot easier.

Okay, enough of praising this feature of update-manager, this post is about something else. Ubuntu users and developers might think "apport" now, because apport is an application that provides exactly this, reporting of bugs on program crashes, for all users.

At least for Python applications and libraries in Debian providing this functionality should be easy. The only thing one has to do is create a sys.excepthook implementation that does the bug reporting, just as in update-manager.
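As a rough idea of what such a hook could look like, here is a minimal sketch. It is not the update-manager code: make_excepthook and the report callback are hypothetical names, and where update-manager asks the user and then invokes reportbug, this sketch merely hands the formatted traceback to whatever callable it is given.

```python
import sys
import traceback


def make_excepthook(report):
    """Build a sys.excepthook replacement that passes tracebacks to report().

    report is any callable taking the traceback text; in a real application
    it would ask the user for confirmation and start the bug reporting tool.
    """
    def hook(exc_type, exc_value, exc_tb):
        text = "".join(
            traceback.format_exception(exc_type, exc_value, exc_tb))
        report(text)
        # Still print the traceback the usual way.
        sys.__excepthook__(exc_type, exc_value, exc_tb)
    return hook


# Collect reports in a list instead of filing real bugs.
reports = []
hook = make_excepthook(reports.append)

try:
    1 / 0
except ZeroDivisionError:
    hook(*sys.exc_info())
```

An application would activate the hook once at startup with sys.excepthook = make_excepthook(...); from then on every uncaught exception in the main thread ends up in the reporting code.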

The question I have now is:

Do you think this feature would be a good addition to the Debian distribution?

2009-07-09

update-manager weekly update #6

So finally I have the time to provide you with a weekly update, instead of my usual bi-weekly ones.

Unfortunately I did not work on anything on last week's TODO list, but found other issues I worked on and corrected. So let's have a look at what I've done.

Debian packaging update

I have done some work on the Debian packaging, which allows update-manager to be built using dpkg-buildpackage now. The way packages are split is not finalized yet and not up-to-date with my (and my mentor's) idea of how we should do that. You can expect an update to that soonish.

Automatically invoking package list reloading / update check

There is a command line switch (namely -c, or --check) now, that automatically performs an update check on startup. This gives other programs, like software-properties, a way of forcing a check when, for example, the package list sources have changed.
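For illustration, a switch like this can be declared in a few lines. The sketch below uses argparse and made-up names (build_parser, the help text); only the -c/--check option itself is taken from the description above, and update-manager's actual option handling may look different.

```python
import argparse


def build_parser():
    """Return a parser that knows the -c/--check switch."""
    parser = argparse.ArgumentParser(prog="update-manager")
    parser.add_argument(
        "-c", "--check", action="store_true",
        help="check for updates immediately after startup")
    return parser


# Another program would simply append the switch when launching the tool.
options = build_parser().parse_args(["--check"])
```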

Checking/unchecking all updates in Gtk frontend

Finally the small feature of selecting or deselecting all updates works in the Gtk frontend. Special cases like "all updates already checked" or "no updates checked yet" are handled too, meaning that you can only use one of these methods if it actually makes sense.

Package dependencies in python-apt backend and Gtk frontend

Both the python-apt backend and the Gtk frontend are now aware of package dependencies. This means that when you select an upgrade that depends on another one, that other update is selected too. The same works vice versa: deselecting an update also deselects the updates that depend on it. Additionally the UI now lists all dependencies, including dependencies on packages that are not installed yet, and automatically deselects all updates that would require new packages to be installed.
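The selection behaviour can be sketched with a plain dictionary that maps each update to the updates it depends on. This is a simplified illustration with hypothetical function and package names; the real code works on python-apt's package objects rather than strings.

```python
def select(update, depends, selected):
    """Select update and, transitively, every update it depends on."""
    stack = [update]
    while stack:
        name = stack.pop()
        if name not in selected:
            selected.add(name)
            stack.extend(depends.get(name, ()))
    return selected


def deselect(update, depends, selected):
    """Deselect update and, transitively, every selected update needing it."""
    stack = [update]
    while stack:
        name = stack.pop()
        if name in selected:
            selected.discard(name)
            stack.extend(
                other for other in selected
                if name in depends.get(other, ()))
    return selected


# app needs libfoo, which in turn needs libbar; editor is independent.
depends = {"app": ["libfoo"], "libfoo": ["libbar"]}
after_select = set(select("app", depends, {"editor"}))
after_deselect = set(deselect("libbar", depends, set(after_select)))
```

Selecting app pulls in libfoo and libbar, and deselecting libbar afterwards drops everything that cannot be upgraded without it, leaving only the unrelated editor update.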

Displaying of overall download size in Gtk frontend

There has been a missing feature (ok, maybe a bug) so that the displayed download size would not be updated in the Gtk frontend. This has been fixed.

Install button being set sensitive correctly in Gtk frontend

In the past the install button would be set to either sensitive or insensitive at startup and not updated afterwards. That means that if there were no packages to update when starting update-manager and a later check for updates found new ones, the install button would not be set sensitive again. I fixed that too.

Sorting of packages in Gtk frontend

In the Gtk frontend packages were not sorted at all, which meant that finding a specific package was rather hard. I added code that sorts the update list by package name now, which solves this issue.

Bugfixing humanize_size

The humanize_size method, which is responsible for displaying human-readable sizes in the Gtk frontend, contained a major bug so that sizes were rounded. Again, I was able to solve this.
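For readers who have not seen such a helper before, this is roughly what a function of this kind does. The sketch is an assumption from start to finish: it shares only the name with update-manager's method, and the units and the number of decimal places are my own choice for the example.

```python
def humanize_size(num_bytes):
    """Format a byte count using binary units, with one decimal place."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit we know about.
        if size < 1024.0 or unit == units[-1]:
            if unit == "B":
                return "%d B" % size
            return "%.1f %s" % (size, unit)
        size /= 1024.0


small = humanize_size(512)
medium = humanize_size(1536)
large = humanize_size(5 * 1024 ** 2)
```

Keeping the value as a float until the very last formatting step avoids losing the fractional part along the way.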

Next week's TODO list

As I didn't find time to work on last week's TODO list my new TODO list is in fact my old one, with additional "Bugfixing" and "Debian packaging" tasks:

Downloading and installing of updates

Bugfixing (?)

Debian packaging

Checking that everything is documented

Even more unit tests

Pylint checking

If time permits and everything else works correctly: working on an aptdaemon backend

The next thing you can expect me to update is the Debian packaging and the documentation, which are my highest priority tasks for now, followed by support for downloading and installing updates.

Happy hacking!

2009-07-02

update-manager weekly update #5

Firstly I have to apologize again for not providing you with weekly update #4, but again I didn't have the time to write one, so this post is going to sum up everything that happened since my last update.

Let's have a look at my previous TODO list:

Documentation

Even though my TODO list entry contained a more detailed entry I have updated the UpdateManager documentation as a whole, leaving only a few blank spots right now.

Ubuntu distribution specific code

I implemented changelog fetching for Ubuntu, which works just as fine as its Debian counterpart now.

More unit tests

There are plenty of unit tests now, but not everything is being tested yet. I am especially proud of my Python interface validation code, which is used in unit tests to check whether handlers implement an interface correctly.

Update list downloading

Checking for updates is what caused me major trouble in the past few days. Basically I had all the code ready, but the UI froze for no apparent reason.
However, today I was finally able to identify and fix the problem. As I expected, my code was just fine, but python-apt was messing up. I am going to discuss the exact problem and its solution later on, but first: a screenshot. :-)

Note: As you probably noticed I replaced the default progressbar with a pulsating one, because we cannot get exact information on how many items/bytes to fetch and would likely get a progress bar moving backwards, which isn't beautiful.

Further changes

The TODO list was rather short and I did a lot of other work, which I want to elaborate on.

Dynamic selection of frontend, backend and distribution specific modules

Even though this is probably not of any interest to John Doe, it helps a great deal when debugging code as all three components can be selected via separate command line switches now.
Additionally some magic has been put in place that automatically detects the system's distribution and loads the corresponding distribution specific module. This is done via lsb_release and the newly introduced code in UpdateManager.Util.lsb.
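To give an idea of how such detection can work, here is a minimal sketch that asks lsb_release for the distributor ID. The function names and the fallback value are assumptions made for this example; the real code lives in UpdateManager.Util.lsb and may look quite different.

```python
import subprocess


def parse_lsb_release(output):
    """Turn `lsb_release -a` style output ("Key:<tab>value" lines) into a dict."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def detect_distribution(default="Debian"):
    """Return the distributor ID, or `default` if lsb_release cannot be run.

    The fallback value is an assumption made for this sketch.
    """
    try:
        output = subprocess.check_output(
            ["lsb_release", "-a"], stderr=subprocess.DEVNULL, text=True
        )
    except (OSError, subprocess.CalledProcessError):
        return default
    return parse_lsb_release(output).get("Distributor ID", default)
```

The returned name could then be used to pick the matching distribution specific module, unless a command line switch overrides it.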

Pylint cleanup

Just out of curiosity I decided to start a pylint run on the codebase and quite a few problems were detected, which I then fixed. To be honest though I added quite some code afterwards that probably needs pylint checking and fixes again.

update-manager IPC

My original plan and IPC design involved using callback functions and passing them between the different modules. Even though this worked out fine I had the feeling this wasn't clean enough and decided to ditch this approach and replace it with handler classes.
The handler base classes now provide an interface of methods that are called on certain events and their implementations act accordingly. The main benefit was that I could easily drop a lot of enums and rather have different methods handling different events.
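The difference is probably easiest to see in code. This is a simplified sketch with made-up class and method names, not the actual update-manager handler interface: instead of one callback that receives an enum value, there is one method per event.

```python
class ReloadCacheHandler:
    """Hypothetical handler base class: one method per event, no-op by default."""

    def cache_reload_begin(self):
        pass

    def cache_reload_done(self):
        pass


class RecordingHandler(ReloadCacheHandler):
    """Example implementation that simply records which events it saw."""

    def __init__(self):
        self.events = []

    def cache_reload_begin(self):
        self.events.append("begin")

    def cache_reload_done(self):
        self.events.append("done")


def reload_cache(handler):
    """Stand-in for a backend operation that reports progress to a handler."""
    handler.cache_reload_begin()
    # ... the actual work would happen here ...
    handler.cache_reload_done()
```

A frontend only overrides the methods it cares about, and no status enum has to be passed around to tell the events apart.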

Gtk, threads and python-apt

With the new IPC approach it became easier to use threads that do the actual work in the background, which I had implemented in next to no time, but a few problems showed up.
Whilst cache reloading from within a thread worked just fine, checking for updates did not, and until today I didn't know why. I spent a good amount of time debugging this issue, even using Python profiling, but nothing obvious showed up. The background process was running, whilst the UI froze.
Today I finally found the root of the problem: python-apt. Even though I assumed that the python-apt worker threads must be stealing CPU time from the thread running gtk.main I wasn't sure how this could be happening, having two completely independent threads.

Now, the cause of all this mess was that Python has a global interpreter lock (GIL) and it seems as if this one stays *LOCKED* while C code, such as the code python-apt comes with, is running. The solution lies in calling Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS from within the C code, to release the global lock and let the Python interpreter do some work every now and then.

As for the python-apt acquire code, I was able to allow other threads to work as soon as the fetching code starts working and only disallow threads when actually modifying Python objects or calling methods and/or functions. Surprisingly, python-apt already made use of this in its cache loading code, but not in the fetch progress code.
Fixing this problem took me less than half an hour and you probably can't believe how glad I was to finally get things working again.
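The fix itself was made in python-apt's C code, which is not shown here. The effect of releasing the global lock can be observed from pure Python though. The sketch below relies on time.sleep, a C function that releases the lock while it waits; the helper names are made up for this example.

```python
import threading
import time


def blocking_c_call(seconds):
    # time.sleep is implemented in C and releases the global lock while
    # waiting, so other Python threads keep running in the meantime.
    time.sleep(seconds)


def run_in_threads(count, seconds):
    """Start `count` threads that each block for `seconds`; return elapsed time."""
    threads = [
        threading.Thread(target=blocking_c_call, args=(seconds,))
        for _ in range(count)
    ]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start
```

Four threads sleeping 0.2 seconds each finish in roughly 0.2 seconds rather than 0.8. A C extension that keeps holding the lock during a long operation behaves the opposite way: every other thread, including the one running gtk.main, has to wait.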

UI updates & other changes

Some details in the UI were anything but optimal, like horizontal scrollbars in a few places, which I removed. Additionally I saw the need to move some code out of the Gtk frontend's __init__.py file and to a separate ui.py file.
A full list of all changes I made is available from the bzr changelog at bzr.debian.org.

A few more screenshots

Finally, I would like to provide you with two more screenshots (don't worry about my system being insecure because of not applied updates - this is a testing machine that is not up-to-date on purpose):

Update Manager main screen with details & changelog

TODO list

My TODO list for next week:

Downloading and installing of updates

Checking that everything is documented

Even more unit tests

Pylint checking

If time permits and everything else works correctly: working on an aptdaemon backend

2009-06-22

Python interface validation

When I started working on update-manager I thought using zope.interface for my interfaces was a good idea, but soon realized that it lacked a way of actually validating a given interface against an implementation. The only thing it did was checking whether the implementation defined that it implements the interface.

Now, whilst writing some unit tests for update-manager I came up with a simple way of doing "real" validation, and I would like to share that Python code with you.

Firstly, I'd like to give you an overview of which checks my code carries out:

Mandatory method (raises NotImplementedError in interface definition) is not implemented (also raises NotImplementedError in implementation)

Optional or mandatory method is of correct type (static method versus instance method)

Optional or mandatory method has a different signature (argument count is different)

I consider at least the first and the last check viable for validating an interface against its implementation. The second check I listed is not that useful and may produce false positives when someone uses certain decorators. I did not carry out any tests on that myself though.
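To illustrate the idea, here is a small sketch of the three checks using the inspect module. It is not the code from update-manager's repository. As a simplifying assumption it treats every public method of the interface as mandatory and counts a method as "not implemented" when the implementation does not override it.

```python
import inspect


def _unwrap(attr):
    """Return (function, is_static) for a raw class attribute."""
    if isinstance(attr, staticmethod):
        return attr.__func__, True
    return attr, False


def validate(interface, implementation):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    for name, raw in vars(interface).items():
        iface_func, iface_static = _unwrap(raw)
        if name.startswith("_") or not inspect.isfunction(iface_func):
            continue
        impl_raw = inspect.getattr_static(implementation, name, None)
        # Check 1: missing, or inherited unchanged from the interface.
        if impl_raw is None or impl_raw is raw:
            problems.append("%s: not implemented" % name)
            continue
        impl_func, impl_static = _unwrap(impl_raw)
        if not inspect.isfunction(impl_func):
            problems.append("%s: not a method" % name)
            continue
        # Check 2: static method versus instance method.
        if impl_static != iface_static:
            problems.append("%s: static/instance mismatch" % name)
        # Check 3: argument count.
        expected = len(inspect.signature(iface_func).parameters)
        actual = len(inspect.signature(impl_func).parameters)
        if expected != actual:
            problems.append(
                "%s: expected %d arguments, got %d" % (name, expected, actual)
            )
    return problems
```

In a unit test one would then simply assert that validate(SomeInterface, SomeImplementation) returns an empty list.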

The code can be found in update-manager's repository (link) and (for now) is licensed under the GPLv2 or later. I am willing to distribute this code as a separate Python module (maybe under a more permissive license like the LGPL) if enough (let's say at least two) people are interested in it, so please let me know if you like it.

Apart from the code itself the unit tests in the file linked above should explain how this beast exactly works.

Happy hacking!

2009-06-19

update-manager weekly update #2

First of all: yes, I skipped update #1. I was rather busy with some assignments and exams at university and didn't work that much on update-manager the past two weeks.

Anyways, this update contains everything that has happened since update #0.

Changelog fetching

The changelog fetching code has been added to update-manager. This means that the changelog is shown in the details section now and should look the same as it did before. However, I have only written that code for Debian so far; the Ubuntu part is on my TODO list.

Documentation

The documentation has been updated and uploaded to alioth and can be viewed here. I have set up a python environment on alioth which allows building the documentation directly, rather than building it locally and uploading it then. Basically this works by having a separate python packages directory, containing some mock modules that are needed (think gtk and friends here), allowing us to build the docs without having to install all dependencies.
I am planning on elaborating on this method and how to create such an environment in one of my upcoming posts, so stay tuned if you could use something like this too.

In addition to this environment, the documentation itself has been updated a great deal: it now covers more modules and documents previously undocumented methods and classes.

Application module

I have reworked some aspects of the UpdateManager.Application module, allowing me to do unit testing on pretty much every aspect of the class. The problem I fixed here is that Application directly called sys.exit when something went wrong. It now raises exceptions instead, which contain the status code and are handled in the respective scripts (i.e. "update-manager").
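The pattern looks roughly like this. The class names, the "--fail" switch and the status codes are invented for this sketch and do not reflect the real UpdateManager.Application API.

```python
import sys


class ApplicationExit(Exception):
    """Raised instead of calling sys.exit(); carries the exit status."""

    def __init__(self, status, message=""):
        super().__init__(message)
        self.status = status


class Application:
    """Hypothetical application class that never exits the process itself."""

    def __init__(self, argv):
        self.argv = argv

    def run(self):
        if "--fail" in self.argv:
            raise ApplicationExit(2, "something went wrong")
        return 0


def main(argv):
    """What a launcher script would do: map the exception to a status code."""
    try:
        return Application(argv).run()
    except ApplicationExit as exc:
        if str(exc):
            sys.stderr.write("%s\n" % exc)
        return exc.status
```

The launcher script then only needs to call sys.exit(main(sys.argv[1:])), whilst unit tests can call Application.run directly and assert on the raised exception without the test runner being terminated.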

Gtk Frontend and updates from another thread

One thing I fixed was the problem caused by the changelog fetching code running in a separate thread and invoking a callback function that updates the UI. It seems that Gtk isn't that happy when you do this, and the UI wouldn't be updated immediately (it seemed that this only happened after some events, like scrolling the update list). This has been reworked and the callback function now checks if it was called from the main thread or not and calls gtk.gdk.threads_enter/_leave accordingly.
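A sketch of that check could look like the following. To keep the example runnable without PyGTK, the enter and leave functions are passed in as parameters. In the real frontend these would be gtk.gdk.threads_enter and gtk.gdk.threads_leave. The decorator name is made up, and the sketch assumes that the main thread is the one running the Gtk main loop.

```python
import functools
import threading


def ui_callback(enter, leave):
    """Wrap a callback so it takes the UI lock when called from a worker thread."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if threading.current_thread() is threading.main_thread():
                # Called from the main thread: no extra locking is done.
                return func(*args, **kwargs)
            enter()
            try:
                return func(*args, **kwargs)
            finally:
                # Always release the lock, even if the callback raises.
                leave()

        return wrapper

    return decorator
```

The callback itself stays unaware of which thread it runs in, which keeps the threading details out of the UI code.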

Changelog Viewer

After finishing the changelog fetching code I added the ChangelogViewer widget from previous update-manager versions again, supporting creation of links to Launchpad and Debian bugs (i.e. LP:NNNNNN and Closes: #NNNNNN are now links) and displaying the version number in bold, among other things.
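Finding those bug references boils down to two regular expressions. The following sketch is not the widget's actual code; the patterns and the URL schemes (bugs.launchpad.net/bugs/N and bugs.debian.org/N) are assumptions about what such links would point to.

```python
import re

# Matches "LP: #123456" as well as the shorter "LP:123456".
LP_BUG = re.compile(r"LP:\s*#?(\d+)")
# Matches "Closes: #123456" as used in Debian changelogs.
DEBIAN_BUG = re.compile(r"Closes:\s*#(\d+)", re.IGNORECASE)


def find_bug_links(text):
    """Return (matched_text, url) pairs for bug references in a changelog."""
    links = []
    for match in LP_BUG.finditer(text):
        url = "https://bugs.launchpad.net/bugs/" + match.group(1)
        links.append((match.group(0), url))
    for match in DEBIAN_BUG.finditer(text):
        url = "https://bugs.debian.org/" + match.group(1)
        links.append((match.group(0), url))
    return links
```

A viewer widget would then turn each matched text span into a clickable link that opens the corresponding URL.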

Weeding out UpdateManager.Frontend.Gtk.utils

Initially I just copied over the utils module from old update-manager to the new implementation, leaving every single function in there, but now I decided to weed out the module. The result is that only the functions actually used by this implementation remain in there. Related to this, documentation of that module is pending and on my TODO list.

Version number

After a chat with my mentor we decided to bump update-manager's version to 0.200-pre. This should make it easier to distinguish from the old version and indicates that a lot has changed. The first release following the -pre series will be 0.200.0, which should then include all functionality old update-manager included.

My TODO list for next week

Ordered by priority

Documentation of UpdateManager.Frontend.Gtk.utils and .ChangelogViewer modules

Ubuntu Distribution Specific code

More unit tests

Update list downloading in Gtk frontend

2009-06-02

Should CLI debug output and error messages be localized in a GUI application?

Whilst working on update-manager I have been wondering whether I should use gettext for localizing debug output and error messages sent to stderr.
As for debug output itself I basically do not see the need for providing a localized version for each and every message sent to stderr, but as far as error messages are concerned I am uncertain.

The point is that update-manager (apart from its experimental text interface) is usually not launched from a terminal at all and so most users won't even see these messages ever. Also, I believe that every developer's English skills are good enough so that he or she is able to understand simple messages.
Error messages however might be useful to all users when they experience a problem with the software, but localizing those could make handling bug reports a bit harder, possibly having to translate the error message back to English before being able to see what has gone wrong.

So basically I am asking you: What do you think? Is it worth localizing these messages? What is your experience with localized or non-localized error and debug messages?

I would be glad if I could get some input from you, either as a comment to this article, via email to debian(dot)sp(dot)or(dot)at or through the update-manager-devel mailing list.

2009-06-01

sphinx-aware Enums in Python

As I promised to keep you updated on recent developments on update-manager I am writing this article. Just as a disclaimer: I am not going to write about any recent developments here, but would rather like to point at a piece of code I added to update-manager that could be useful in other applications too.

Now, as the title suggests there are sphinx-aware Enums in update-manager. Enums are common constructs in other programming languages like C and allow simple creation of constants with, for example, ascending values (first constant has value 0, second has value 1 and so on). Python unfortunately does not include support for Enums itself, but I found it rather easy to write classes that emulate such a construct.

Nothing is new about Enums in Python and there are probably quite a few different implementations out there, but I believe mine is different. The sphinx-aware part means that my implementation automagically updates the docstrings of the created instances and thus allows sphinx' "autodata" method to include sensible information in generated API documentation.

I could go on writing about and praising my method, but I believe a short example gives you a better idea how my implementation works and what I wanted to achieve with this. Have a look at this page, which is part of update-manager's new API documentation. You should see rather well-looking documentation of the UpdateManager.Backend.RELOAD_CACHE_STATUS NegativeEnum, the defined constants, their values and some additional information about each value now.

Still, nothing too fancy, HTML documentation generated from docstrings. What makes this special is the code from which it was generated:

RELOAD_CACHE_STATUS = NegativeEnum(
  BEGIN = "Started reloading package cache",
  DONE = "Finished reloading package cache")

This not only gives us a RELOAD_CACHE_STATUS enum, along with the RELOAD_CACHE_STATUS.BEGIN and RELOAD_CACHE_STATUS.DONE, but also some documentation, included in RELOAD_CACHE_STATUS' docstring, that can be used by sphinx.
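For those who do not want to dig through the repository, here is a heavily simplified sketch of how such a class could be built. It is not update-manager's implementation. In particular, the assumptions that a NegativeEnum counts down from -1 and that the docstring has this layout are mine.

```python
class EnumValue(int):
    """An int that remembers its name and carries its own docstring."""

    def __new__(cls, value, name, doc):
        obj = super().__new__(cls, value)
        obj.name = name
        obj.__doc__ = doc
        return obj


class NegativeEnum:
    """Hypothetical enum whose members get the values -1, -2, -3, ..."""

    def __init__(self, **members):
        lines = ["Enumeration with the following values:", ""]
        # Keyword arguments keep their order in current Python versions.
        for position, (name, doc) in enumerate(members.items(), start=1):
            member = EnumValue(-position, name, doc)
            setattr(self, name, member)
            lines.append("%s = %d: %s" % (name, member, doc))
        # The generated docstring is what a documentation tool would pick up.
        self.__doc__ = "\n".join(lines)
```

With the RELOAD_CACHE_STATUS definition from above, this sketch yields BEGIN == -1 and DONE == -2, and RELOAD_CACHE_STATUS.__doc__ lists both constants together with their descriptions.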

You can find the Enum code, which is rather short and should be quite easy to understand, here. I hope you find this code as useful as I do.

2009-05-28

update-manager on alioth

As I noted in this week's update-manager progress update, one of my tasks was to create an alioth.debian.org project and get my branches uploaded to Debian.

I did not imagine that alioth admins (hi there, a huge "thank you" goes to you guys) would be this fast with reviewing and accepting the project and enabling bazaar support for me.
Anyways, the project has been accepted and its new home is on alioth. I have also already uploaded both my update-manager branch and python-apt branch to bzr.debian.org

Additionally I have generated the API documentation, which is also hosted on alioth, and created a development discussion mailing list, update-manager-devel at lists.alioth.debian.org.

If you are interested in this project feel free to have a look at what I've done so far and join the development discussion. Comments, criticism and ideas are always welcome.

update-manager weekly update #0

It has been more than a month since I last wrote about my work on update-manager during this year's Google Summer Of Code and I am somewhat ashamed I wasn't able to provide you with updates more regularly.

So first of all, yes, I did do some work and yes, there has been quite some progress. Basically both private and university stuff have kept me from writing and that's why I'd like to start with this series of weekly updates today.
This series is meant to summarize what has happened during a week of writing code and give you an overview of what's happening. This first issue however will sum up the past month.

So let me begin explaining what has happened since my last post.

update manager to become more modular

In my last post I wrote about how I got accepted for GSoC09 and am going to work on update manager. Now I couldn't wait for the actual GSoC09 coding period to start and created my own update manager branch right away and started hacking.

So far I have only written a few lines of code, but my mentor Michael Vogt and I came to the conclusion that whilst working on the internals of update manager it might be a good idea to make the whole program more modular.
Right now all the different functions of update manager (being the UI/frontend and the package manager interface/backend) are mixed up in various files, which makes not only reading the code harder, but also extending update manager more difficult. This was reason enough for me to have a look into making update manager more modular in its design and some of my efforts can already be seen in my update manager branch.

If you have any comments on the proposed backend interface or see major problems with it, please let me know, I would really appreciate some input on that. Also, the UI and the distribution-specific code interfaces are next on my list, before beginning to actually move existing code around. I hope to be able to finish that work before the GSoC hacking period starts, so I can concentrate entirely on my task of making update-manager distribution independent.

2009-04-21

Summer Of Code 2009: Working for Debian

Yesterday Google announced the students and projects that have been accepted for Google Summer Of Code 2009 and guess what: my project was accepted. This means I will be working full-time on FOSS this summer.

So I guess it's about time to introduce my project to you: Distribution-independent update manager, mentored by Michael Vogt (mvo).

Okay, I believe some of you might wonder what this project is all about, as update-manager is in the Debian package archive already. There is a problem with update-manager though. As you see in the package's version number (it contains ".debian") update-manager has been adapted for use in Debian. Also, Debian contains update-manager 0.68 right now, whilst upstream (Ubuntu in this case) has released 0.111.6 (actually there were quite a few upstream versions meanwhile). The reason Debian is nowhere near being up-to-date with upstream is that right now a lot of effort has to be put into porting update-manager to Debian every time a new upstream release is made, because certain Ubuntu-specific functionality breaks update-manager in more or less severe ways on Debian.

This leads me directly to what my project is about: making update-manager (Ubuntu-) distribution-independent, but not package manager independent.
There are 6 main goals for this project, which I will be working on in the order below.

Analyzing the code and identifying Ubuntu-specific parts.

Creating a distribution-plugin interface and moving the Ubuntu-specific parts into a distribution-plugin, creating a core package that is distribution-independent.

Creating a special notification for important/security related updates and providing the code that handles updates from security.debian.org as such.

Creating a backend-plugin interface, moving the synaptic backend into a backend-plugin and optionally creating a python-apt based plugin.

UI redesign, providing a simpler interface to average joe, whilst allowing more experienced users to optionally display more information.

Automatic downloading & installation of updates. This is still up for discussion, as automatic downloading is already provided by software-properties (-gtk and -kde) and automatic installation can be handled by unattended-upgrades. Both packages are part of Debian already.

Please note that this list should not be considered final and may be extended or modified over time. It exists to give you an overview of what exactly my project is about and how I am planning on carrying out the tasks.

Finally I wanted to let you know that I will keep you posted on the progress I am making, via this blog. Alternatively a blog aggregator for Debian's GSoC students has been set up over at http://soc.alioth.debian.org/feeds/blogs/, where you can not only find my posts, but those of all of Debian's students.

2009-04-02

Python everywhere: computer games

This is the second article in my series Python everywhere and covers the use of Python in computer games. The first article of this series covered the use of Python for the conficker worm scanner tool and can be found here.

Problems running PHP as a separate FastCGI process

As some of you might have noticed this webserver has not been that responsive in the past few hours and I have been working hard on getting that fixed. I finally identified the problem and was able to fix it.

The root of the problem was my setup running PHP as a separate FastCGI process. Unfortunately it seems as if PHP can only handle 500 requests per FastCGI process and then locks up.
The old setup of this site didn't cause such problems, and it seems the problem lies in not setting the PHP_FCGI_CHILDREN and PHP_FCGI_MAX_REQUESTS environment variables with the new setup.

Python everywhere: extending applications with Python

Extending applications with Python: gimp, Evolution, Inkscape, Paint Shop Pro, [...]

Python everywhere: A Python Operating System called cleese

Cleese....

Python everywhere: conficker scanner

This article is the first in my new series "Python everywhere".

As this is the first article in this series I would like to explain what the series is all about.
As an avid Python user and developer I want to share my observations whenever I find Python applications doing not-so-usual things, such as Python applications running on embedded devices. In the end I want to point out just what the name of this series suggests: Python is everywhere and can be used for everything.

So, straight ahead to the first issue: the conficker scanner.

Introducing pyttpd

In this article I would like to inform you about my newest pet-project: pyttpd.

pyttpd is my effort of implementing a webserver in Python, with a focus on security (through privilege separation), extensibility and scalability.

I started this project because I was not entirely happy with the lack of flexibility and support for privilege separation by popular webservers. Whilst both lighttpd and Apache httpd provide means of running processes under different users these usually require hacks like suexec. Additionally I am somehow curious about how a fully-fledged webserver implemented in Python would perform compared to the mentioned daemons.

UPDATE: AdSense on freedom blog reloaded

I just wanted to inform you that I am in the process of adding AdSense ads to this blog.
However, I am planning on having a one-ad-per-post policy, whilst not placing any ads on the front page.

More details on this topic will follow in the next few days.

UPDATE:

I have now integrated AdSense into this blog. As promised the front page does not contain any ads, but all other pages do. Ads are shown as a widget so they are not in-text and thus should not disturb you whilst reading.

2009-03-29

python-argvalidate has hit Debian unstable

I am proud to announce that python-argvalidate has hit Debian unstable yesterday.

This does not only mean that you can install argvalidate on Debian-based systems more easily now, but also that python-argvalidate has met the strict criteria of the Debian Free Software Guidelines, and as such has been confirmed to be Free Software.

Also, I wanted to let you know that I am maintaining the Debian package itself, which means that updates to python-argvalidate itself will be included in Debian as fast as possible, usually within two days.

How using proprietary software can affect system security

There has been a lot of discussion on whether Free Software is more secure than proprietary software, but I have an additional argument that shows how the use of Free Software can improve system security.

Now you probably expect me to come up with a pure technical reason showing superiority of Free Software, but I am taking another path this time: let's talk about user trust.

A possible attack - what to do about this?

Just as I wanted to start writing an article here and I entered the URL of this blog into my browser I got no response from the webserver, zero, nothing.
First I thought the PHP fastcgi process for this virtual host died, but a quick check on another virtual host suggested that something else was going on.

So I guessed the lighttpd process itself must be experiencing problems of some sort, but after doing a "netstat -nat" I knew what was going on:

tcp6       1      1 83.65.62.72:80          61.135.190.248:12474    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.234:39671    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.253:39211    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.234:55160    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.230:25836    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.231:16865    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.232:24266    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.240:38441    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.243:17726    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.241:38206    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.251:23892    LAST_ACK
tcp6       1      1 83.65.62.72:80          61.135.190.225:29675    LAST_ACK

Plus "a few" more of those. Now I'm not entirely sure whether it's just some systems misbehaving or actually an attack, but my feelings told me this could have been intentional after all.
I did a quick whois on one of those IP addresses and came up with the 61.135.0.0/16 network which is owned by China Network Communications Group Corporation.
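Grouping the remote addresses by network can also be scripted. Below is a small sketch that counts connections per /16 network from "netstat -nat" output. The function name is made up, and it assumes the column layout shown above, with the remote address in the fifth column.

```python
import ipaddress
from collections import Counter


def count_remote_networks(netstat_output, prefix=16):
    """Count TCP connections per remote network of the given prefix length."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # The fifth column is "address:port"; strip the port.
        remote_host = fields[4].rsplit(":", 1)[0]
        try:
            network = ipaddress.ip_network(
                "%s/%d" % (remote_host, prefix), strict=False
            )
        except ValueError:
            # Skip anything that is not a plain IP address (e.g. "*").
            continue
        counts[str(network)] += 1
    return counts
```

For the output above, every connection ends up in the single bucket 61.135.0.0/16, which is what made blocking that one network tempting.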

As the connections were made from pretty much every host in that network I had two choices: sit it out or block it.

I came to the conclusion that blocking the entire subnet from connecting to this system, at least temporarily, might be a viable solution and so I did.
However, in hindsight I am wondering whether I really had to block an entire /16 network, so I am asking you: how do you usually handle such situations?

2009-03-24

python-argvalidate 0.9.0 released

Even though I had planned to provide a release candidate first, as can be seen in the project's Mercurial changelog, I have released python-argvalidate 0.9.0 today. Tarballs can be obtained from the Python Package Index (PyPI), as usual.

Presented in H^H^H^H^HIPv6

I just wanted to let you know that this blog (actually all webpages I am hosting) is now accessible via IPv6. Additionally, my mail server now also accepts IPv6 SMTP and IMAP connections, allowing communication with the IPv6 world.

The setup uses SixXS as tunnel broker, with AMIS being the SixXS PoP in use.
If you experience any problems with the services I am providing via IPv6, please let me know, either via a comment to this article or an email to ipv6@sp-its.at.

Freedom blog reloaded launch

Welcome to my new blog, "freedom blog reloaded".

Now with this first article I would like to elaborate on the name of the blog, the purpose and what you are likely to find here in the future.

Okay, let's start straight ahead with the name of the blog. Freedom in the blog's name refers to Free Software, which is going to be the main topic of the articles you will find here.
I would like to keep you informed about my involvement in the Free Software community and hopefully provide you with some useful information when it comes to configuring and running Free Software.

Now you might still ask what the "reloaded" part in the blog's name is about. Well, I have done some blogging in the past, but due to various reasons didn't have the time to provide my readers with a constant flow of articles, but this should change now. I am planning on regularly keeping you informed.

On to the last thing I wanted to write about: the kind of articles you are likely to find here in the future.
I am planning on writing posts on development in the Free Software community, updates to the Debian GNU/Linux packages I either maintain or co-maintain, the projects I am working on and last but not least some tips and tricks when it comes to day-to-day operation.

Lastly, as this is a blog dedicated to Free Software it's a good idea to let you know that this blog is being run on a Free Software stack completely and I am using Free Software only to write articles.
The setup is as follows: lighttpd, my webserver of choice, runs on a Debian GNU/Linux system and, along with PHP5 and MySQL, forms the base for running WordPress, a blogging system written in PHP.
For writing articles I am using, guess what, a browser, namely Iceweasel (also known as Firefox to non-Debian users), running on my Debian GNU/Linux workstation.

I guess that's it for now. As a last note I would like to point out that even though comments have been disabled for this article I will enable them for all posts where discussion makes sense.

-- Stephan