Saturday, July 25, 2009

Playing with DVCS for wxWidgets

I wanted to share my recent experience of trying to use a DVCS for wx. I was interested in this because I often need to test a change on multiple platforms before committing it. Currently I make the modification in an svn checkout on one machine, test it there, make a patch, apply it to an svn checkout on another machine, test it there, commit from the first one, then undo the patch on the other one and update there. This is not really complicated but certainly involves many more steps than I'd like. And this is the simple case, when the patch actually works: if it doesn't and you need to change it, it's all too easy to get entangled in multiple copies of the patch and get lost.

On the other hand, I have been using Mercurial (also known as hg) for my own projects for some time, and I enjoy its simplicity and how easy it is to create a separate clone of the repository for a set of changes. It's also much faster than svn for a lot of common operations such as viewing the file history, annotating a file or finding a given log message (which it can do by keyword, unlike svn). So I decided to try using hgsubversion, a bi-directional Mercurial-SVN gateway, for wx.

Unfortunately it didn't go well. Importing the wx tree took a long time (~30 hours) and ran out of memory a few times, as reported here, resulting in the process being killed by the Linux out-of-memory killer (one of the few times I've actually been glad to see this happen, as a process consuming 8GB of memory without any good reason does deserve to be killed). Of course, this is a one-time operation and so it doesn't matter much but, still, it was hardly a great start. More importantly, though, working with this setup turned out to be inconvenient in practice because cloning the entire wx tree takes time, even with hg's efficiency, especially to a different machine. It can hardly be otherwise considering that the full cloned tree is 1.7GB. That is not so much in absolute terms: an svn checkout of the trunk alone is 330MB, while the hg tree contains all versions of the project and not just the latest one. But it's still a lot and, as we'll see below, it can be much better.

An alternative could have been to use Mercurial named branches, but they are not meant for private changes: using them would leave traces in the svn history, which is really not ideal (these branches are for my own personal testing and I do not want others to see how many mistakes I made while making a trivial change!). Or there is the Mq extension, which is supposed to be one of the greatest things about Mercurial, but unfortunately I could never get used to it, and it just doesn't seem right to me to use what is basically an orthogonal VCS on top of the one normally used. Also, the patch queue is local to each repository, so with it I'd basically be reduced to copying patches around again. Maybe the most promising extension is pbranch, which really does seem to allow me to do what I need. But it's non-standard, I'm unsure about its future prospects and it seems rather complicated, thus negating the main advantage of hg: its simplicity.

So, with a heavy heart, I turned to another popular DVCS: Git. I think it could be described as Mercurial's evil cousin. While Mercurial is as easy to use as it could be and has great documentation, Git is almost perversely complicated. It has concepts which are particular to it alone (can anyone really explain what purpose the existence of the index serves, except for confusing new users and occasionally tripping up more experienced ones?). Its included documentation is only useful if you already know very well what you are doing. It allows you (I think it encourages you, really) to make errors, which is, of course, fine, as there are 3 or 4 different ways to undo them. Of which 2 (different ones, depending on the situation) make things even worse. It seems to enjoy reusing commands commonly used in other VCSs to do something different. Even the commands which seem to do what you'd expect (e.g. pull and push) do not. Moreover, they are not really even opposites of each other. So you never know what a command with a simple name does and you have no chance of finding any other commands without reading half a dozen Git tutorials. And even then you have to remember that the equivalent of hg histedit is git rebase -i (with rebase in general doing something completely different, of course). And using git means having one extra letter to type for every command compared to hg!

So ever since I found Mercurial I never seriously considered using Git. While I agree that Git is more powerful, having 37 different ways to shoot oneself in the foot is not really what I'm looking for in my VCS. Unfortunately, Git does have one killer feature: local branches. This is exactly what I need when working with wx svn and is close to what the Mercurial pbranch extension does. Except that, in this particular case only, Git is actually simpler. And faster.

Speaking of faster: importing wx svn using git-svn took "only" 12 hours. And it never consumed any appreciable amount of RAM. And, a really pleasant surprise, the git repository of wx is only 400MB. That is hardly bigger than an svn checkout of a single trunk revision (while the git repository, like the hg one, contains all versions of all branches in the project) and more than 4 times smaller than the hg one. In spite of myself, I was impressed. Think about it: this means that if you have both a 2.8 and a trunk checkout of wx, you actually save 200MB of disk space by using Git, while gaining all the advantages of having the entire project history locally (which is the reason why switching between 2 branches in git is practical but using svn switch is not). And if, like me, you have 4 branches checked out (2.8, 2.9.0 (well, hopefully not for much longer, this one), SOC2009_FSWATCHER and trunk), the space savings become really noticeable (almost 1GB).

But it gets better: "cloning" (creating a new local branch) with git is instantaneous. Switching to another existing branch (e.g. the 2.8 one) is much faster than with hg. Even updating from svn seems to be faster, although here the difference is not really significant. Being able to use the usual hg pull instead of git svn rebase is a significant advantage of Mercurial, but unfortunately it's easier to get used to idiosyncratic syntax than to slowness (and yes, committing is done with git svn dcommit; I'm sure there is a logical explanation for this extra "d", too...).

So I'm using git as my svn client for now (all of 2 days). And I'm ashamed to say I love it. Of course, hg is great compared to svn too. But I can't realistically use it with svn right now, and I can with git. And so I don't have to juggle patches any more. And the coloured output of git diff is so much easier to read than that of svn diff (and even than that of hg diff with colour on, as git also nicely highlights white space errors). Now if only I didn't forget to use that --cached option half of the time...

To summarize, I wholeheartedly recommend using Git as a client for the wx svn repository. If there is any interest in it, I could push my repository to Github (it's bigger than their 300MB limit for the free plan but I hope they could make an exception). But even if you need to run git-svn yourself, it's still great to have a local git repository if you plan on submitting patches to wxWidgets (or even just keeping them privately). Of course, any DVCS could be used to gain this extra freedom of working with wx in any way you want. But while I still hope that hg implements local branches in the future and that hgsubversion improves (there doesn't seem to be much point in hoping that the git interface becomes logical), for now Git is the best choice of DVCS to use with wxWidgets.

Monday, July 13, 2009

Blogging about logging

I've just finished a series of changes meant to make wxLog less embarrassing and more useful. Of course, wxLog was always meant to be a simple logging framework adapted to the typical logging patterns of GUI applications, but there is such a thing as being too simple, and it has been apparent for quite some time that wxLog was insufficient for any application using multiple threads, or even one simply separated into multiple components whose logging should be controlled independently. And as most applications nowadays do use multiple threads, this is a serious limitation indeed.

As an aside, when I realized that the deficiencies of wxLog really prevented it from being useful in the application I was working on, my first idea was not to enhance it but to switch to another, dedicated logging library. But, incredibly enough, I couldn't find any good candidate: there are tons of libraries based on log4j, but translating a Java API into C++ is really not a good idea and I hoped to find something more idiomatically C++-ish. So I naturally turned towards Boost and found not one but two libraries named "Boost.Log", with one even confusingly called "Boost.Log v2" despite being older than the other one. Unfortunately, while both of them are undoubtedly great libraries, I was completely overwhelmed by their complexity. They allow some things I wouldn't even think of if I were creating a new logging library from scratch, e.g. the possibility to associate a decrementing counter, starting from 100 with a step of -5, with every log record. This is extremely impressive but doesn't seem to be especially useful in practice, and I'd prefer to simply use a logging library instead of admiring its marvellous elegance. So I passed on them too, and decided that while wxLog might be too simple, keeping it simple enough was still very important.

With this in mind, I decided to simply fix the few most glaring omissions in wxLog:
  1. Lack of support for logging from threads other than main.
  2. Impossibility to treat logs from different parts of application differently.
  3. Absence of __FILE__, __LINE__ and __FUNCTION__ information.
The first one was already solved for some log targets, e.g. wxLogWindow was already thread-safe as it collected the messages coming from other threads and actually displayed them in its text control only during idle time, from the main thread. All I did was extend this approach to all log targets by moving its implementation into wxLog itself.

This does introduce a new problem however: as the messages are buffered instead of being output immediately, they could be lost if the program crashes before the main thread has a chance to output them. So I also added the concept of per-thread log targets, which can be associated with a single thread only and don't need to do any buffering. Of course, such a target can't show messages to the user, as this can only be done from the main GUI thread, but it can log them to a file, and so a thread can always set up a wxLogStderr or a wxLogStream to ensure that its messages are saved in a file as soon as they are output.
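To make the buffering idea concrete, here is a minimal standard C++ sketch of it. It is not the wxLog implementation: the BufferedLog class and its Log(), Flush() and Shown() methods are hypothetical names invented for this illustration. Messages logged from the main thread are output at once, while messages from other threads are queued under a mutex until the main thread flushes them (which, in a GUI application, would presumably happen during idle time).

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a log target; NOT the real wxLog class.
class BufferedLog {
public:
    explicit BufferedLog(std::thread::id mainThread) : m_main(mainThread) {}

    // May be called from any thread.
    void Log(const std::string& msg) {
        if (std::this_thread::get_id() == m_main) {
            Output(msg);                // main thread: output immediately
        } else {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_pending.push_back(msg);   // other threads: queue for later
        }
    }

    // Must be called from the main thread, e.g. during idle time.
    void Flush() {
        assert(std::this_thread::get_id() == m_main);
        std::vector<std::string> pending;
        {
            // Hold the lock only while taking ownership of the queue.
            std::lock_guard<std::mutex> lock(m_mutex);
            pending.swap(m_pending);
        }
        for (const auto& msg : pending)
            Output(msg);
    }

    // Messages that have actually been output so far.
    const std::vector<std::string>& Shown() const { return m_shown; }

private:
    // Stands in for showing the message to the user.
    void Output(const std::string& msg) { m_shown.push_back(msg); }

    std::thread::id m_main;
    std::mutex m_mutex;
    std::vector<std::string> m_pending;
    std::vector<std::string> m_shown;
};
```

Anything still sitting in the pending queue when the process dies is lost, which is exactly the weakness that an unbuffered per-thread target avoids.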

On a related note, using wxLogNull (and wxLog::EnableLogging(), which it uses internally) now only disables logging for the current thread and not for the application as a whole. This makes sense: if you just want to suppress an error message from a wxWidgets function you're going to call, you shouldn't disable all the logs from the other threads of your application, which can be doing something completely unrelated while this function is executing. The initial plan was to also add a new way of disabling logging globally, but after thinking about it for quite some time I couldn't find any realistic use case where doing this would be really useful. So for now logging can only be disabled per thread, but we can always make it possible to disable it either globally or, which probably makes more sense, on a per-target basis, if really needed.
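The per-thread behaviour can be pictured with a thread-local flag and an RAII guard. The sketch below is only an illustration in standard C++, assuming this is roughly the shape of the mechanism: g_loggingEnabled, IsLoggingEnabled() and LogSuppressor are hypothetical names, not wxWidgets API, and the real wxLogNull may well be implemented differently.

```cpp
#include <cassert>
#include <thread>

namespace sketch {

// One flag per thread: changing it never affects other threads.
thread_local bool g_loggingEnabled = true;

inline bool IsLoggingEnabled() { return g_loggingEnabled; }

// RAII guard in the spirit of wxLogNull (hypothetical reimplementation):
// disables logging for the current thread and restores the previous
// state when it goes out of scope.
class LogSuppressor {
public:
    LogSuppressor() : m_previous(g_loggingEnabled) { g_loggingEnabled = false; }
    ~LogSuppressor() { g_loggingEnabled = m_previous; }

    LogSuppressor(const LogSuppressor&) = delete;
    LogSuppressor& operator=(const LogSuppressor&) = delete;

private:
    bool m_previous;
};

} // namespace sketch
```

Saving and restoring the previous value, rather than unconditionally re-enabling logging, keeps nested guards well behaved.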

The second problem was solved by introducing the notion of "log components". These are simply arbitrary strings which identify the component which logged a message. By default, messages logged by wxWidgets come from the log component "wx" and its subcomponents, that is strings starting with "wx/" like, for example, "wx/net/ftp", while messages generated outside of wxWidgets have an empty log component, as none is defined by default. This is already useful as sometimes you may want to treat wxWidgets messages and your own differently, e.g. you could disable all non-error messages from wxWidgets by setting the log level of the "wx" component to wxLOG_Error while keeping all messages from your code, including the debugging ones, enabled. But this feature becomes really useful mostly when you define your own custom log components. This is done simply by #define-ing wxLOG_COMPONENT before using the wxLogXXX() functions. It can be done on the compiler command line (to ensure that the same value is used uniformly everywhere) or inside the source files. In either case you will probably want to use different values for different parts of your application, e.g. "myapp/ui", "myapp/db", "myapp/network" and so on. You can then configure the log level for each module independently and, just as importantly, you can distinguish between the messages logged by different components and send them to different final destinations (e.g. database-related messages to one log file and network ones to another) from your overridden wxLog::DoLogRecord().
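Here is a small standard C++ sketch of how per-component log levels could be looked up. It is an illustration only, not the wxWidgets code: the Level enum and the ComponentLevels class are hypothetical, and the sketch assumes that a subcomponent without its own setting inherits the level of its nearest configured parent (so that a level set for "wx" also applies to "wx/net/ftp").

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical levels: smaller values are more severe.
enum Level { Level_Error = 1, Level_Warning, Level_Message, Level_Debug };

class ComponentLevels {
public:
    // Component names are '/'-separated paths without a trailing slash.
    void Set(const std::string& component, Level level) {
        assert(component.empty() || component.back() != '/');
        m_levels[component] = level;
    }

    // The most specific configured ancestor wins; otherwise use the default.
    Level Get(std::string component) const {
        while (!component.empty()) {
            auto it = m_levels.find(component);
            if (it != m_levels.end())
                return it->second;
            const auto slash = component.rfind('/');
            if (slash == std::string::npos)
                break;
            component.erase(slash);     // "wx/net/ftp" -> "wx/net" -> "wx"
        }
        return m_default;
    }

    // A message is let through if it is at least as severe as the
    // level configured for its component.
    bool IsEnabled(const std::string& component, Level level) const {
        return level <= Get(component);
    }

private:
    std::map<std::string, Level> m_levels;
    Level m_default = Level_Debug;      // everything enabled by default
};
```

With this model, setting the "wx" component to the error level silences debug messages from "wx/net/ftp" while leaving "myapp/db" and the empty component untouched.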

Finally, to solve the last problem in the list, all wxLogXXX() functions have been replaced by macros with the same names, which allows the location of the log message to be recorded. It can be retrieved in DoLogRecord() from the wxLogRecordInfo struct passed to it. By default, this information is not used in any of the predefined loggers (yet?) but it's available in case you need it.
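The reason a macro is needed is that __FILE__ and __LINE__ are expanded where they are written: inside an ordinary function they would always describe the logging function itself, not its caller. The following standard C++ sketch shows the general technique; RecordInfo, Record, MakeRecord() and MY_LOG are hypothetical names loosely modelled on the description above and are not the actual wxWidgets definitions.

```cpp
#include <cassert>
#include <string>

// Hypothetical counterpart of the location part of wxLogRecordInfo.
struct RecordInfo {
    const char* file;
    int line;
    const char* func;
};

struct Record {
    RecordInfo info;
    std::string message;
};

// Ordinary function: it cannot know where it was called from,
// so the location has to be passed in explicitly.
inline Record MakeRecord(const RecordInfo& info, const std::string& msg) {
    assert(info.file != nullptr);
    return Record{info, msg};
}

// The macro is expanded at the call site, so __FILE__, __LINE__ and
// __func__ describe the code doing the logging.
#define MY_LOG(msg) MakeRecord(RecordInfo{__FILE__, __LINE__, __func__}, (msg))
```

Callers keep writing a plain function-like call, which is why existing code using the old functions can continue to compile unchanged.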

Moreover, in the process of doing this, I actually created a relatively generic mechanism for passing arbitrary extra information to the log functions but, still remembering my experience of reading the Boost.Log documentation, I decided not to make it public for now and to keep things simple.

After all, with the additions mentioned above wxLog is already much more useful and hopefully good enough even for complex wxWidgets applications now. And if not, we'd be interested to hear about any features still missing, of course, so do have a look at the improved wxLog version in svn trunk and let us know what you think!

Sunday, July 05, 2009

June News

Here is a brief summary of changes in wxWidgets during the past month. Once again, we (and I in particular) didn't have time to do as many things as we'd have liked, but less is better than nothing. And, in the case of the most important new feature added, late is hopefully better than never, as I seem to remember requests for it from at least 10 years ago. Now we finally have ... drum roll, please ... support for images in wxButton:

[Screenshots: wxButton with images on different platforms]

(the images, and hence button sizes, are different in the screenshots above because the standard wxART_INFORMATION icon is used which is platform-dependent).

Another important, even if somewhat technical, change was the harmonization of the handling of different background styles under all (major) platforms, as discussed here. As a side effect, this allows background bitmaps in wxHtmlWindow to work again in wxOSX/Carbon.

Other miscellaneous changes:
  • Some convenience methods were added to wxFont: Bold(), Larger(), Smaller() and their non-const versions MakeBold(), MakeLarger(), MakeSmaller(). See the "Similar fonts creation" section in the wxFont documentation.

  • There were several additions to XRC:


  • wxDirCtrl gained support for multiple selections (thanks to Steve Lamerton).

  • wxStandardPaths behaviour under Windows is now more flexible, see its new IgnoreAppSubDir() method.

  • wxVariant was improved to support wxLongLong.



Speaking of wxVariant, Jaakko is currently working on a new, better, safer and more efficient replacement for it called wxAny. Any feedback about it would be much appreciated as we'd really like to make a class we wouldn't be ashamed of later (which is unfortunately a feeling I often have about wxVariant).

Finally, the usual statistics: there were 418 commits to the repository, 95 tickets were created or reopened and 70 tickets were fixed (hmm, I wonder whether the label "progress" is still applicable?).