Friday, August 08, 2008

Fun with benchmarks

I was looking into optimizing the new UTF-8 string class in wxWidgets 3 and had to decide on the most efficient way to cache the mapping from character positions in a UTF-8 string to byte offsets. So I wrote a simple benchmark to measure the overhead of using thread-specific variables compared to using normal globals.

To be precise, I wrote a simple loop updating the variable (which is, of course, not at all realistic, but that's micro-benchmarks for you) using the following methods:
  1. Direct global variable access

  2. Using compiler thread-specific variables support (__thread for g++ and __declspec(thread) for MSVC)

  3. Using OS-specific TLS support (Win32 TLS or POSIX threads TSD)

  4. Using boost::thread_specific_ptr
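
For illustration, here is a minimal sketch of what the first two variants of such a loop might look like. It is not the original benchmark code: the names (`g_counter`, `t_counter`, `time_loop`, `bench_global`, `bench_thread_local`) are made up for this example, and it assumes a C++11 compiler, using the standard `thread_local` keyword in place of `__thread` or `__declspec(thread)`. The counters are `volatile` only so that the optimizer can't collapse the loop into a single store.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical names, for illustration only (not the original benchmark).
volatile std::uint64_t g_counter = 0;               // method 1: plain global
thread_local volatile std::uint64_t t_counter = 0;  // method 2: compiler TLS

// Calls body() the given number of times and returns the elapsed
// wall-clock time in nanoseconds.
template <typename F>
std::int64_t time_loop(std::uint64_t iterations, F body)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i)
        body();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               stop - start).count();
}

// Resets the global counter, then increments it n times.
std::int64_t bench_global(std::uint64_t n)
{
    g_counter = 0;
    return time_loop(n, [] { g_counter = g_counter + 1; });
}

// Same loop, but on the calling thread's own copy of the counter.
std::int64_t bench_thread_local(std::uint64_t n)
{
    t_counter = 0;
    return time_loop(n, [] { t_counter = t_counter + 1; });
}
```

Calling `bench_global()` and `bench_thread_local()` with the same large iteration count and comparing the returned times gives the kind of comparison described here; the OS-specific and Boost variants would follow the same pattern with the increment replaced by the corresponding get/set calls.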


The results were somewhat surprising, although also encouraging: under both Win32 (x86) and Linux (amd64) the first two methods were the fastest. The OS functions were 3 times slower under Windows and 5 times slower under Linux. The Boost implementation was disappointing, at least from a performance point of view, as it was 2 times slower than the OS functions, making it 6 (Windows) or 10 (Linux) times slower than the fastest version. This is bad news as I had hoped to avoid writing a wxWidgets-specific TLS class and just use the Boost version, but this doesn't seem to be a good idea for performance-sensitive code.

But the biggest surprise, at least for me, came from the comparison of the first two approaches: using compiler support for thread-specific variables turned out to be faster than using plain old globals. This was so unexpected that I even checked the disassembly to make sure I wasn't missing anything, and it turns out that gcc generated exactly the same code for both versions except that in the thread-specific version it used FS-relative addressing to access the value. For MSVC the code wasn't quite the same but it also used FS for the thread-specific variable. So it looks like under both x86 and amd64 using the FS register is actually faster than using normal absolute addressing.

In any case, it's good to know that thread-specific variables bring no performance loss when they are supported by the compiler. Of course, my benchmarks are very specific and, last but not least, they don't have any other threads running. However, I think the results should be broadly true for more realistic code, which I'm going to benchmark once the real caching implementation is written.

Tuesday, May 27, 2008

What you can learn by reading Slashdot

Slashdot had an interesting post today linking to a project which studied the degree of separation between different Wikipedia articles. Possibly moved by a subconscious feeling of shame due to the fact that I was reading Slashdot instead of working on wx, I decided to check what it has to say about the closeness of wxWidgets to paradise. And the result was:

Shortest path from wxwidgets to paradise
WxWidgets
November 27
Eastern Orthodox Church
Paradise
3 clicks needed

I say we're pretty close to the goal! Even better, while we're only 3 clicks removed from Qt too, Qt itself is 1 click further from paradise than wx is. They have some work to do (but then maybe this is what they're doing, instead of reading Slashdot...)

Tuesday, April 29, 2008

Google Summer of Code 2008 and wx

For a couple of years now, there has been something we really look forward to each spring. I'm not speaking about blossoming flowers and all this nonsense, of course, but about the Google Summer of Code. This is a wonderful program which gives many open source projects, including wx, an opportunity to attract new contributors and to work on projects which wouldn't be started otherwise because they require an initial investment beyond what we can normally afford.

So it's great to be part of GSoC again, even though -- let me perform some ritualistic whining -- we got only 2 slots this year while we had 3 of them in previous years. I'm not sure why this happened (probably because ever more projects take part in the program and even Google's budget is not unlimited) but it might be related to the relatively few student proposals we received this year (about a dozen, and 5 of them about the same project). Of course, there were still some worthy proposals which got cut (at least the third and the fourth ones) but I still think that one of the goals for next year should be to prepare more interesting projects and try to attract more interest -- and if you have any great ideas, please let us know or just add them to this wiki page.

Of course, while wxWidgets itself has 2 slots, there are other wx-related projects undertaken by other organizations. wxPython has 3 more slots, Audacity uses one of its 5 for writing new draggable sizer classes which could hopefully be reused by other projects, and even Perl has the wxCPANPLUS project using wx. So life is still good.

But it will be even better once our 2 projects are completed. Personally, I'm mostly excited about the "duller" one of them -- the bug fixing one by Marcin Wojdyr. We currently have an indecent number of open bugs (~1600 at the last count) in our bug tracker and this is way too many. In fact, there are so many of them that even finding an existing bug is difficult. This is partly due to the extremely poor SourceForge bug tracker UI, so one of the first steps in this project will be migrating all our bugs and patches to Trac. And then, of course, fixing some of them. This is probably not as exciting as writing some great new control but it is almost certainly more useful to wxWidgets and will do more for our users so, once again, we're really glad to have the opportunity to do it during this GSoC!